I've got a JSON object in my logs that shows up as the following:
"result":{
"totalRecords":8,
"bot":3,
"member":5,
"message":0,
"reaction":0,
"success":0,
"error":0,
"unknown":8
}
I'm trying to write a logs insights query to graph the values of each of those keys. Essentially I want a line chart with a different line for the value of each of the keys. Currently I have my query as the following:
fields result.bot, result.error, result.member, result.message, result.reaction,
result.success, result.totalRecords, result.unknown
| stats count(result.bot), count(result.error),
count(result.member),count(result.message),
count(result.reaction),count(result.success),
count(result.totalRecords), count(result.unknown) by bin(30s)
This returns the count of how many times the keys show up in the logs, but not the values.
What I need to know is what you use to get the value of a given key. I tried appending a .0 for example count(result.totalRecords.0) as was suggested in the AWS docs but it doesn't return any value. What is the query for the value of a key?
According to the documentation:
Counts the log events. count() (or count(*)) counts all events returned by the query, while count(fieldName) counts all records that include the specified field name.
You can write this instead:
stats sum(result.bot), sum(result.error) by bin(30s)
and so on for the other keys. This gives you the sum of those values over each 30-second period. You can shorten the period if you want finer granularity.
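To make the difference between count and sum concrete, here is a small JavaScript simulation of what the two aggregations compute per time bin. It is not Logs Insights itself: the events array and the aggregate function are made up for illustration.

```javascript
// Hypothetical log events: timestamps in milliseconds plus the parsed "result" object.
const events = [
  { timestamp: 0, result: { bot: 3, member: 5 } },
  { timestamp: 10000, result: { bot: 1, member: 2 } },
  { timestamp: 40000, result: { bot: 4, member: 0 } },
];

// Group events into fixed-width time bins and aggregate one field two ways.
function aggregate(events, field, binMs) {
  const bins = new Map();
  for (const e of events) {
    const start = Math.floor(e.timestamp / binMs) * binMs;
    const bin = bins.get(start) || { count: 0, sum: 0 };
    if (field in e.result) {
      bin.count += 1;             // analogous to count(result.<field>)
      bin.sum += e.result[field]; // analogous to sum(result.<field>)
    }
    bins.set(start, bin);
  }
  return bins;
}

const botBins = aggregate(events, "bot", 30000);
console.log(botBins.get(0));     // { count: 2, sum: 4 }
console.log(botBins.get(30000)); // { count: 1, sum: 4 }
```

The count column only says how many events carried the key, while the sum column adds up the values themselves, which is what the chart in the question needs.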
I have a DynamoDB table. I have an index on the "Id" field.
I also have other fields - status, timestamp, etc, but I don't have index on these other fields.
The status can be "online", "offline", "in-progress", etc.
I want to get the count of the records based on "status" and "Id" field.
The user will pass the Id field and the query needs to return the count based on the status field. e.g.
"online" : 20
"offline" : 30
"in-progress" : 40
The query works fine.
As I understand it, the maximum size of a DynamoDB query result is 1 MB. This limit applies before any FilterExpression is applied to the results.
Since the number of records in the table is large (around 100k), I need to run the query again and again, passing the ExclusiveStartKey parameter each time.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html#Query.Pagination
In fact, I need to run multiple queries (one for each status value) in a loop to calculate the counts based on the "status" field.
Is there an efficient way to retrieve these counts?
I am thinking of extending the index to the status field also. So it will eliminate the need for applying filter expression.
If the field isn't indexed, you need to do a table scan to get the full count. You can parallelize the scan to make it faster, or just index it.
The response has ScannedCount and Count fields, but even if the field is indexed, they give you the full item count only when the query reads less than 1 MB of data.
How many items fit into that 1 MB depends on item size. A single item can be up to 400 KB, so with items that large one query reads only a couple of them before hitting the 1 MB limit, and you get a count of just those. With small items, a single query gets through more of them. Either way, DynamoDB will not read all the data to give you the result in one go. You will get paginated results.
With a proper index your query won't need filters. Without a good index you will end up doing an index scan or table scan, probably with filters applied. Neither changes the underlying fact: a query reads at most 1 MB of data per request and returns paginated results.
From the docs:
ScannedCount — The number of items that matched the key condition
expression before a filter expression (if present) was applied.
Count — The number of items that remain after a filter expression (if
present) was applied.
If the size of the Query result set is larger than 1 MB, ScannedCount
and Count represent only a partial count of the total items. You need
to perform multiple Query operations to retrieve all the results.
Each Query response contains the ScannedCount and Count for the items
that were processed by that particular Query request. To obtain grand
totals for all of the Query requests, you could keep a running tally
of both ScannedCount and Count.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html
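The running tally described in the docs could be sketched roughly like this. This is only a sketch: runQuery is an assumed helper that sends one Query request and resolves to one page of the response, and makeFakeQuery plus the table name are stand-ins so the example runs without AWS access.

```javascript
// Sum Count and ScannedCount over every page of a Query.
// `runQuery` is an assumed helper: it takes Query parameters and resolves to
// one page of the response ({ Count, ScannedCount, LastEvaluatedKey }).
async function countAllPages(runQuery, params) {
  let count = 0;
  let scannedCount = 0;
  let startKey;
  do {
    // Select: "COUNT" asks DynamoDB to return counts only, not the items.
    const request = { ...params, Select: "COUNT" };
    if (startKey) request.ExclusiveStartKey = startKey;
    const page = await runQuery(request);
    count += page.Count;
    scannedCount += page.ScannedCount;
    startKey = page.LastEvaluatedKey; // undefined on the last page
  } while (startKey);
  return { count, scannedCount };
}

// Stub standing in for DynamoDB (hypothetical data).
function makeFakeQuery(pages) {
  let calls = 0;
  return async () => pages[calls++];
}

const fakeCountQuery = makeFakeQuery([
  { Count: 12, ScannedCount: 40, LastEvaluatedKey: { Id: "a", ts: 1 } },
  { Count: 8, ScannedCount: 35 },
]);

const totalsPromise = countAllPages(fakeCountQuery, { TableName: "ExampleTable" });
totalsPromise.then((totals) => console.log(totals)); // { count: 20, scannedCount: 75 }
```

With the AWS SDK for JavaScript v2 DocumentClient, runQuery could be something like (p) => docClient.query(p).promise(), where docClient is your own client instance.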
This may seem like a silly question. The result returned from a DynamoDB query has Items and Count. Items is an array, which has a length property. Are Items.length and Count always the same?
I am using the JavaScript SDK.
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB/DocumentClient.html#query-property
Yes, the length of Items and the Count should be the same.
Some other fun facts about the counts:
Each Query response will contain the ScannedCount and Count for the items that were processed by that particular Query request. To obtain grand totals for all of the Query requests, you could keep a running tally of both ScannedCount and Count.
If the size of the Query result set is larger than 1 MB, then ScannedCount and Count will represent only a partial count of the total items. You will need to perform multiple Query operations in order to retrieve all of the results (see Paginating the Results).
Also, if you just care about the count and not the data, you can ask DynamoDB to only return the count via the Select property of the request.
In DynamoDB is there a way to guarantee that exactly n results will be
returned if I specify a limit and a filter?
The problem I see is that the docs state:
In a response, DynamoDB returns all the matching results within the
scope of the Limit value. For example, if you issue a Query or a Scan
request with a Limit value of 6 and without a filter expression,
DynamoDB returns the first six items in the table that match the
specified key conditions in the request (or just the first six items
in the case of a Scan with no filter). If you also supply a
FilterExpression value, DynamoDB will return the items in the first
six that also match the filter requirements (the number of results
returned will be less than or equal to 6).
So this means 6 items will be retrieved and then the filter applied. How can I keep searching until I get exactly '6' items? (Ideally there is some setting in the query to keep going until the limit has been reached -- or exhaustion has been reached)
For example, suppose I make a query to get 50 people whose name is "john". Dynamo would retrieve 50 people and then apply the "john" filter, so only 3 people are returned.
Is there a way I can ensure it will keep searching until the limit of 50 is satisfied?
I don't want to use a Scan since a Scan always searches every item in the table (regardless of limit -- correct me if I'm wrong on this).
How can I make the query apply the filter lazily, so that it keeps searching until the Limit is satisfied?
If you can express the condition in the query itself, that is best, since you wouldn't need a filter expression at all. If you can't, I suspect that, given the way Dynamo works, the filter is just a pass over the results that were already read: basically a way to save bandwidth, not much more. You can still use pagination to get more results. And if you're using Dynamo you probably care about the rate at which you're querying, so having control over how many queries you actually run (and their size) is a good thing.
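As far as I know there is no query setting that does this for you, so the pagination approach has to live in your own code. A rough sketch: runQuery is an assumed helper that sends one Query request (with your Limit and FilterExpression) and resolves to one page, and the stubbed pages are made-up data.

```javascript
// Keep paging through a filtered Query until `wanted` items are collected
// or the results are exhausted. `runQuery` is an assumed helper that resolves
// to one page of the response: { Items, LastEvaluatedKey }.
async function queryUntil(runQuery, params, wanted) {
  const items = [];
  let startKey;
  do {
    const request = { ...params };
    if (startKey) request.ExclusiveStartKey = startKey;
    const page = await runQuery(request);
    items.push(...page.Items); // only the items that passed the filter
    startKey = page.LastEvaluatedKey;
  } while (startKey && items.length < wanted);
  return items.slice(0, wanted);
}

// Stub standing in for DynamoDB: each page evaluated 50 items, few matched.
const johnPages = [
  { Items: [{ name: "john", id: 1 }, { name: "john", id: 2 }], LastEvaluatedKey: { id: 50 } },
  { Items: [{ name: "john", id: 61 }], LastEvaluatedKey: { id: 100 } },
  { Items: [{ name: "john", id: 120 }, { name: "john", id: 133 }] },
];
let johnCalls = 0;
const fakeJohnQuery = async () => johnPages[johnCalls++];

const johnsPromise = queryUntil(fakeJohnQuery, { Limit: 50 }, 3);
johnsPromise.then((johns) => console.log(johns.map((j) => j.id))); // [ 1, 2, 61 ]
```

The loop stops after the second request here because three matches have been collected. Note that slice may drop surplus matches from the last page, so resuming later from the last LastEvaluatedKey would skip them.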
I have a sample database in CouchDB with the information of a number of aircraft, and a view which shows the manufacturer as key and the model as the value.
The map function is
function(doc) {
emit(doc["Manufacturer"], doc._id)
}
and the reduce function is
function(keys, values, rereduce){
return values.length;
}
This is pretty simple. And I indeed get the correct result when I show the view using Futon, where I have 26 aircraft of Boeing:
"BOEING" 26
But if I use a REST client to query the view using
http://localhost:6060/aircrafts/_design/basic/_view/VendorProducts?key="BOEING"
I get
{"rows":[
{"key":null,"value":2}
]}
I have tested different clients (including web browser, REST client extensions, and curl), all give me the value 2! While queries with other keys work correctly.
Is there something wrong with the MapReduce function or my query?
The issue could be caused by grouping.
Using group=true (which is Futon's default), you get a separate reduce value for each unique key in the map - that is, all values which share the same key are grouped together and reduced to a single value.
Were you passing group=true as a query parameter when querying with curl and the other clients? Futon passes it by default, which is why you saw a result like
BOEING : 26
whereas without group=true only the overall reduced value is returned.
So try this query:
http://localhost:6060/aircrafts/_design/basic/_view/VendorProducts?key="BOEING"&group=true
You seem to be falling into the rereduce trap. Strictly speaking, CouchDB uses a map-reduce-rereduce process.
Map: reformats your data into the output format.
Reduce: aggregates the data of several entries with the same key (but not necessarily all of them), which works correctly in your case.
Re-reduce: does the same as reduce, but on previously reduced data.
Because your reduce function changes the format of the value (from document IDs to a count), the re-reduce call counts the already reduced values instead of adding them up.
Solutions:
Set the value in the map function to 1 and have the reduce function return the sum of the values.
Or check for rereduce == true and, in that case, return the sum of the values, which are the integers returned by the initial reduce.
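The second solution could look roughly like the sketch below, together with a small simulation of why the original function returns 2. The simulate helper and the 20/6 split are made up for illustration (CouchDB chooses its own chunking), and keys is passed as null only to keep the example short.

```javascript
// Reduce from the question: on rereduce, `values` holds partial counts,
// so values.length counts the partial results instead of adding them up.
function buggyReduce(keys, values, rereduce) {
  return values.length;
}

// Rereduce-aware version: count rows on the first pass, add partial counts afterwards.
function countReduce(keys, values, rereduce) {
  if (rereduce) {
    return values.reduce((total, partial) => total + partial, 0);
  }
  return values.length;
}

// Simulate reducing 26 rows in two chunks and then combining the partial results.
function simulate(reduceFn) {
  const chunkA = new Array(20).fill("doc-id");
  const chunkB = new Array(6).fill("doc-id");
  const partials = [reduceFn(null, chunkA, false), reduceFn(null, chunkB, false)];
  return reduceFn(null, partials, true);
}

console.log(simulate(buggyReduce)); // 2
console.log(simulate(countReduce)); // 26
```

If all you need is a row count, CouchDB's built-in _count reduce function should give the same result without any custom JavaScript.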
[Screenshot (jpg) not shown]
In that picture I have highlighted one part. I have an attribute called "deviceModel" that can contain more than one value. I want to query my domain for the items (by itemName()) whose deviceModel attribute contains more than one value.
Thanks,
Senthil Raja
There is no direct way to get what you are asking for. You need to write your own code. Running a SELECT query gives you each item's attribute-value pairs, so you need to traverse each itemName() and count the values of the attribute you are interested in.
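A minimal sketch of that traversal in JavaScript. The shape of the items (Name plus an Attributes list of Name/Value pairs) is my assumption about what the Select response looks like in your SDK, and the sample data is made up.

```javascript
// Return the names of items whose `attributeName` holds more than one value.
// `items` is assumed to look like the Items array of a Select response:
// [{ Name: "item1", Attributes: [{ Name: "deviceModel", Value: "A" }, ...] }, ...]
function itemsWithMultipleValues(items, attributeName) {
  return items
    .filter((item) =>
      item.Attributes.filter((attr) => attr.Name === attributeName).length > 1)
    .map((item) => item.Name);
}

// Hypothetical sample data.
const sampleItems = [
  {
    Name: "device-1",
    Attributes: [
      { Name: "deviceModel", Value: "X100" },
      { Name: "deviceModel", Value: "X200" },
      { Name: "owner", Value: "alice" },
    ],
  },
  { Name: "device-2", Attributes: [{ Name: "deviceModel", Value: "X100" }] },
];

console.log(itemsWithMultipleValues(sampleItems, "deviceModel")); // [ 'device-1' ]
```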
I think what you are referring to is called multi-valued attributes. When you put a value into an attribute without replacing the existing value, the values accumulate, giving you a list of values under that attribute name.
How you create them depends on the SDK/language you are using for your REST calls; look for the Replace=true/false option when you set the attribute's value.
Here is the documentation page on retrieving them: http://docs.amazonwebservices.com/AmazonSimpleDB/latest/DeveloperGuide/ (look under Using Amazon SimpleDB -> Using Select to Create Amazon SimpleDB Queries -> Queries on Attributes with Multiple Values)