Kibana/Elastic Regex Query Returns No Results

We have Logstash receiving syslog files and then storing these in an Elasticsearch index.
We are trying to query this index with Kibana to find some particular information but we cannot get the regex queries to work.
The log data we are trying to search within is below.
Field name = message
Field type = keyword
<14>1 2018-05-02T13:53:48.079000Z snrvro04 vco - - [liagent#6876
anctoken="" component="WorkflowManagementServiceImpl" context=""
filepath="/var/log/vco/app-server/integration-server.log"
instanceid="6a6dbf1d-2f72-45db-ab57-04b84aa97b90"
log_message="Workflow 'Get ID of
Workflow/8f59ca66-7472-4efa-ac5f-dfc34059c5f1' updated (with
content)." priority="INFO" product="vro" token="" user="" wfid=""
wfname="" wfstack=""] 2018-05-02 13:53:48.079+0000 vco:
[component="WorkflowManagementServiceImpl" priority="INFO"
thread="https-jsse-nio-0.0.0.0-8281-exec-7" user="" context=""
token="" wfid="" wfname="" anctoken="" wfstack=""
instanceid="6a6dbf1d-2f72-45db-ab57-04b84aa97b90"] Workflow 'Get ID of
Workflow/8f59ca66-7472-4efa-ac5f-dfc34059c5f1' updated (with content).
The information we are trying to search for is:
component="WorkflowManagementServiceImpl"
AND more importantly:
Workflow 'Get ID of Workflow/8f59ca66-7472-4efa-ac5f-dfc34059c5f1'
The first criterion should always be the same, but the workflow name and ID will change. The only parts that remain the same within this bit of text are Workflow ' and the final '.
We are currently trying our queries against the Workflow name and ID to see if we can match on that, but our queries return no results.
The regex we currently have is as follows, and we have tried numerous alternatives.
/(?<=Workflow '.*\/)(.*')/
If we run the search *Workflow* (with wildcards), it returns everything containing the word Workflow, as expected.
If we run the search Workflow on its own, we get no results.
If anyone can provide pointers towards where we are going wrong, or getting confused, that would be great!
Thanks

We resolved this by using Grok filters in Logstash to organise/clean the data before it hits the Elasticsearch indexes, and then we were able to search successfully within Kibana. In hindsight the behaviour makes sense: the message field is a keyword, so the whole line is indexed as a single term; a bare Workflow search can only match documents whose entire message equals Workflow, and Lucene regular expressions are likewise anchored to the whole term and do not support lookbehind, which is why the regex above never matched.
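For anyone hitting the same wall, here is a minimal sketch of the kind of grok filter we mean, assuming the raw line arrives in the message field (the pattern and the target field names are illustrative, not our exact production config):

filter {
  grok {
    # pull the component and the workflow name/ID out of the raw syslog line
    match => {
      "message" => "component=\"%{DATA:component}\".*Workflow '%{DATA:wfname}/%{UUID:wfuuid}'"
    }
  }
}

Once component, wfname and wfuuid are indexed as their own fields, plain term queries in Kibana find them without any regex.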

Dynamic query (current date) via web services in Power BI

In my project we consume the company's data via a REST web service. Today the query is not dynamic: we pass the start date and end date parameters as literal strings.
My goal is for the end date to update dynamically. I've already created a query that takes the current date but I can't put it in the parameter without generating an error in the query.
When I put the column value in the parameter, I get an error in the query.
I'm pretty sure I'm getting the syntax wrong; any help is much appreciated. Note that for the API call to work, the date format must be DD/MM/YYYY.
Can you try using
PutYourOtherTableNameHere[Hoje_Coluna]{0}
instead of
[Hoje_Coluna]
?
To see if that will work, put this in right before your query, then click on the step and see what it returns.
x = PutYourOtherTableNameHere[Hoje_Coluna]{0},
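If that works, the full pattern in M looks roughly like this; it is only a sketch with placeholder table, column and URL names, and Date.ToText takes care of the DD/MM/YYYY requirement:

let
    // hypothetical one-row table/column holding today's date
    EndDate = PutYourOtherTableNameHere[Hoje_Coluna]{0},
    // the API only accepts dates formatted as DD/MM/YYYY
    EndDateText = Date.ToText(EndDate, "dd/MM/yyyy"),
    // placeholder URL; only the end date parameter is built dynamically
    Source = Json.Document(Web.Contents("https://example.com/api?inicio=01/01/2023&fim=" & EndDateText))
in
    Source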

Kibana: can I store "Time" as a variable and run a consecutive search?

I want to automate a few search in one, here are the steps:
Search in Kibana for this ID:"b2c729b5-6440-4829-8562-abd81991e2a0", which will return a bunch of logs. Of these logs I need to take the first and the last timestamp.
I would now like to store these two timestamps, FROM: September 3rd 2019, 21:28:22.155 and TO: September 3rd 2019, 21:28:23.524, in two variables.
Then run a second search in Kibana for the word "fail" between these two time variables.
How can I automate the whole process without copy/paste and manually running a second query?
EDIT:
SHORT STORY LONG: I work in a company that produces software for autonomous vehicles.
SCENARIO: A booking is rejected and we need to understand why.
WHERE IS THE PROBLEM: I need to monitor just a few seconds of logs on 3 different machines. The logs are completely separate and there is no relation between them, so I cannot write a single query in Discover; I need to run 3 separate queries.
EXAMPLE:
A booking was rejected, so I open Chrome and search on "elk-prod.myhost.com" for the BookingID:"b2c729b5-6440-4829-8562-abd81991e2a0", and I get a dozen logs returned within a range of 2 seconds (FROM: September 3rd 2019, 21:28:22.155, TO: September 3rd 2019, 21:28:23.524).
Now I need to know what was happening on the car, so I open a new Chrome tab and search on "elk-prod.myhost.com" for the CarID "Tesla-45-OU" in the time range FROM: September 3rd 2019, 21:28:22.155, TO: September 3rd 2019, 21:28:23.524.
Then I need to know why the server that calculates the matching rejected the booking, so I open a new Chrome tab and search for the word CalculationMatrix in the same time range.
CONCLUSION: I want to stop opening Chrome tabs by hand and automate the whole thing. I have no idea around what time the booking was made, so I first need to search for the BookingID "b2c729b5-6440-4829-8562-abd81991e2a0", then store the timestamps of the first and last log, and run a second and third query based on those timestamps.
There is no relation between the 3 logs I search, so there is no way to filter from Discover; I need to automate 3 different queries.
Here is how I would do it. First of all, from what I understand, you have three different indexes:
one for "bookings"
one for "cars"
one for "matchings"
First, in Discover, I would create three Saved Searches, one per index pattern. Then in Visualize, I would create a Vertical bar chart on the bookings saved search (Bucket X-Axis by date_histogram on the timestamp field, leave the rest as is). You'll get a nice histogram of all your booking events bucketed by time.
Finally, I would create a dashboard and add the vertical bar chart + those three saved searches inside it.
When done, the way I would search according to the process you've described above is as follows:
Search for the booking ID b2c729b5-6440-4829-8562-abd81991e2a0 in the top filter bar. In the bar chart histogram (bookings), you will see all documents related to the selected booking. On that chart, you can select the exact period from when the very first booking document happened to the very last. This will adapt the main time picker at the top, and the start/end time will be "remembered" by Kibana.
Remove the booking ID from the top filter (since we now know the time range and Kibana stores it). Search for Tesla-45-OU in the top filter bar. The bar histogram + the booking saved search + the matchings saved search will be empty, but you'll have data inside the second list, the one for cars. Find whatever you need to find in there and go to the next step.
Remove the car ID from the top filter and search for ComputationMatrix. Now the third saved search is going to show you whatever documents you need to see within that time range.
I'm lacking realistic data to try this out, but I definitely think this is possible as I've laid out above, probably with some adaptations.
Kibana does work like this (any order is ok):
Select time filter: https://www.elastic.co/guide/en/kibana/current/set-time-filter.html
Add additional criteria to the search, for example field s is b2c729b5-6440-4829-8562-abd81991e2a0.
Add additional criteria to the search, for example field x is Fail.
Additionally, you can view the surrounding documents: https://www.elastic.co/guide/en/kibana/current/document-context.html#document-context
This is how Kibana works.
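In the query bar, that boils down to a single KQL line (assuming the fields are literally named s and x as in the example above):

s : "b2c729b5-6440-4829-8562-abd81991e2a0" and x : "Fail"

combined with the time picker set to the range you care about.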
You can prepare some filters beforehand, save them, and then reuse them if you want to somehow automate the process of discovering.
You can do that in the Discover tab in Kibana using the New/Save/Open options.
Edit:
I do not think you can achieve what you need in Kibana. As I mentioned earlier, one option is to change the data that is coming into Elasticsearch so you can search for it via Discover in Kibana. Another option could be building, for example, a Java application that uses Elasticsearch; then you can write an algorithm that returns the data that you want. But I think it's a big overhead and I recommend checking the data first.
Edit: To clarify - you can create an external Java application, let's say a SpringBoot application, that uses Elasticsearch; all the data that you need is inside Elasticsearch.
But with this option you will not use Kibana at all.
You can export the result to CSV or whatever format you want in the code.
The SpringBoot application can ask Elasticsearch for whatever it needs, and then it is easy to store these time variables inside the Java code.
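To make that concrete, here is roughly what such an application would send to Elasticsearch. The index names, field names and document shape are assumptions here, so adjust them to your mapping. First an aggregation to find the first and last timestamp for the booking:

POST /bookings-*/_search
{
  "size": 0,
  "query": { "match": { "bookingId": "b2c729b5-6440-4829-8562-abd81991e2a0" } },
  "aggs": {
    "first_seen": { "min": { "field": "@timestamp" } },
    "last_seen": { "max": { "field": "@timestamp" } }
  }
}

The follow-up searches then reuse the two returned values in a range filter, for example:

POST /cars-*/_search
{
  "query": {
    "bool": {
      "must": [ { "match": { "carId": "Tesla-45-OU" } } ],
      "filter": [ { "range": { "@timestamp": { "gte": "2019-09-03T21:28:22.155Z", "lte": "2019-09-03T21:28:23.524Z" } } } ]
    }
  }
}

The third query (CalculationMatrix) is the same range filter with a different must clause.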
EDIT: After OP edited question to change it dramatically:
#FrancescoMantovani Well, the edited version is very different from what you first posted here: How to automate the whole process without need of copy/paste and running a second query? - and search for the word fail in a single shot. In the accepted answer you are still using three filters, one at a time, so it is not one search, but three.
What's more, if you used one index and sent data from multiple hosts via filebeat, you wouldn't even have to create this dashboard to do that. You could select the exact period from when the very first document happened to the very last for one filter, then remove it and add another filter that you need - it's as simple as that. Before, you were writing about one query,
How to automate the whole process without need of copy/paste and
running a second query?
not three. And you don't need to open a new Chrome tab each time you want to change the filter; just organize the data, for example by using filebeat as mentioned before.
There is no relation between the 3 logs
From what you wrote, the relation does exist, and it is time.
If the data is in, for example, three different indices (because the documents don't share much similar data), you can do it like this:
You can change indices easily in Discover:
Go to Discover, select index 1, search, and select the time range that you need; when you change the index, the time range is still the one you selected, so you only need to change the filter and you will get what you need.

Google Stackdriver Log Based Metrics: how to extract values using a regular expression from the log line

I have log lines of the following form in my Google Cloud Console:
Updated blacklist info about 123 minions. max_blacklist_per_minion=20, median_blacklist_per_minion=8, blacklist_free_minions=31
And I'm trying to set up some log-based metrics to get a longer-term overview of the values (i.e. how are they changing? Are they lower or higher than yesterday? etc.).
However, I didn't find any examples for this scenario in the documentation, and what I could come up with doesn't seem to work. Specifically, I'm trying to understand what I need to select as the "Field name" to get access to the log line (so that I can write a regular expression against it).
I tried textPayload, but that seems to be empty for this log entry. Looking at the actual log entry, there should also be a protoPayload.line[0], but that doesn't seem to work either.
In the "Metric Editor" built into the logs viewer UI you can use "protoPayload.line.logMessage" as the field name. For some reason the UI doesn't want to suggest 'line' (seems like a bug; same behavior in the filter box).
The log based metric won't distinguish based on the index of the app log line, so something like 'line[0]' won't work. For a distribution all values are extracted. A count metric would count the log entry (ie 1 regardless the number of 'line' matches).
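For example, to track the median value from the sample line above as a distribution metric, the extractor regex just needs a single capture group around the number (the regex is a sketch, untested against your exact payload):

Field name: protoPayload.line.logMessage
Extraction regular expression: median_blacklist_per_minion=(\d+)

The same pattern with max_blacklist_per_minion or blacklist_free_minions gives you the other two series.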

Index as a NumericField using Sitecore Advanced Database Crawler

I need to search a price field using a Lucene range query. However, the results it gives are not accurate or consistent, since I am using a TermRangeQuery from the Lucene.Net API. I believe that using a NumericRangeQuery I could get accurate results, but for NumericRangeQuery to work, the field needs to be indexed using NumericField. Is there a way I can do this with the Advanced Database Crawler?
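For reference, this is the kind of query I am ultimately trying to run; the field name and bounds are illustrative only:

// matches prices between 100 and 500 inclusive, against the trie-encoded numeric terms
var query = NumericRangeQuery.NewLongRange("price", 100L, 500L, true, true);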
I tried to do this by altering the Advanced Database Crawler source code, but it is not working for me.
These are the changes I have made in the Advanced Database Crawler: in the scSearchContrib.Crawler.Crawlers.AdvancedDatabaseCrawler class, in the CreateField method, I added the following code.
if (name.EndsWith("numeric"))
{
    // route fields whose configured name ends in "numeric" to a Lucene NumericField
    field = new NumericField(name, storageType, true);
}
In the index configuration I took the field name and appended the text "numeric" to it; when searching, I pass the field name correctly with the "numeric" part removed.
When building the index I get an error like this:
Job started: RebuildSearchIndex|System.NullReferenceException: Object reference not set to an instance of an object.
at Lucene.Net.Store.IndexOutput.WriteString(String s)
at Lucene.Net.Index.FieldsWriter.WriteField(FieldInfo fi, Fieldable field)
at Lucene.Net.Index.StoredFieldsWriterPerThread.AddField(Fieldable field, FieldInfo fieldInfo)
at Lucene.Net.Index.DocFieldProcessorPerThread.ProcessDocument()
at Lucene.Net.Index.DocumentsWriter.UpdateDocument(Document doc, Analyzer analyzer, Term delTerm)
at Lucene.Net.Index.DocumentsWriter.AddDocument(Document doc, Analyzer analyzer)
at Lucene.Net.Index.IndexWriter.AddDocument(Document doc, Analyzer analyzer)
at Lucene.Net.Index.IndexWriter.AddDocument(Document doc)
at Sitecore.Search.IndexUpdateContext.AddDocument(Document document)
at Sitecore.Search.Crawlers.DatabaseCrawler.AddItem(Item item, IndexUpdateContext context)
at Sitecore.Search.Crawlers.DatabaseCrawler.AddTree(Item root, IndexUpdateContext context)
at Sitecore.Search.Crawlers.DatabaseCrawler.AddTree(Item root, IndexUpdateContext context)
at Sitecore.Search.Crawlers.DatabaseCrawler.AddTree(Item root, IndexUpdateContext context)
at Sitecore.Search.Index.Rebuild()
at Sitecore.Shell.Applications.Search.RebuildSearchIndex.RebuildSearchIndexForm.Builder.Build()|Job ended: RebuildSearchIndex (units processed: 1)
Can someone tell me a way to do this using Advanced Database Crawler?
Thanks in advance.
Even though I couldn't index the field as a numeric field, I found a workaround for the problem: index the prices with padded zeros so that the Lucene TermRangeQuery gives correct search results. Every price is indexed left-padded with zeros so that each value contains 10 digits. That way the results I get are accurate.
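A minimal sketch of the padding, assuming whole-number prices (the 10-digit width is our choice, not something the crawler dictates):

long price = 950;
// pad to a fixed width so lexicographic (term) order matches numeric order:
// 950 -> "0000000950" sorts before 1200 -> "0000001200"
string paddedPrice = price.ToString("D10");

Both the indexed value and the TermRangeQuery bounds have to be padded the same way, otherwise the comparison is still plain string order and the ranges come back wrong. (For what it's worth, the NullReferenceException above is consistent with the NumericField being added without a value; NumericField needs SetIntValue/SetLongValue etc. called before the document is written.)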

String IDs are not quoted in dependent batch request to API. Workaround?

I'm currently trying to query the Facebook API to retrieve some data via batch requests containing two FQL queries.
One of the queries fetches a set of album IDs in the form of:
Select aid FROM album WHERE ...
While the other one retrieves the photos for the albums found:
SELECT ... FROM photo WHERE aid IN ({result=album_ids:$.*.aid})
Where 'album_ids' is the name of the first query.
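For context, the full batch payload we send looks roughly like this (the method/name/relative_url fields follow the Graph API batch docs; the WHERE clauses are elided as above):

[
  { "method": "GET", "name": "album_ids", "relative_url": "fql?q=SELECT+aid+FROM+album+WHERE+..." },
  { "method": "GET", "relative_url": "fql?q=SELECT+...+FROM+photo+WHERE+aid+IN+({result=album_ids:$.*.aid})" }
]

This array goes URL-encoded into the batch parameter of a single POST against the Graph API.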
Most of the time this works perfectly, but sometimes an album comes along with an aid containing a '_'. That should be perfectly fine, since the documentation specifies the aid as a string.
However, the JSONPath substitution in the second query does not quote the IDs, so the Facebook API responds with:
Parser error: unexpected '_xxxxx' at position xx
...
SELECT ... FROM photo WHERE aid IN (10000xxxxxxxxxx_xxxxx)
The JSON result for the first query clearly has them quoted:
[{\"aid\":\"xxxxxxxxxxxxxxxxxxx\"},{\"aid\":\"10000xxxxxxxxxx_xxxxx\"},...]
Am I missing something here, or does Facebook wrongly skip quoting the IDs in the second query even though they are clearly strings?
As far as I can see in the Facebook API and JSONPath specs, this should be working.
Or is there a workaround to get this to behave as expected? (Except for doing the quoting client-side with two separate requests.)
Right now I'm trying to change my query as suggested here: Quoting/escaping jsonpath elements for in clause of dependent fql queries
But maybe there is a way without completely restructuring the queries themselves.