Elasticsearch showing only 1 docs.count on data migration using Logstash - amazon-web-services
I am trying to move data from S3 (data in .csv files) to an Elasticsearch cluster using Logstash with a custom template.
But the index only shows docs.count=1, with the rest of the records counted under docs.deleted, when I check with the following query in Kibana:
GET /_cat/indices?v
My first question is:
Why is only one record (the last one) indexed, while the others show up as deleted?
Now, when I query this index in Kibana with the query below:
GET /my_file_index/_search
{
  "query": {
    "match_all": {}
  }
}
I get only one record, with the comma-separated data in the "message" field. So the second question is:
How can I get the data indexed with the column names, just like in the CSV, given that I have specified all the column mappings in the template file fed into Logstash?
I also tried specifying the columns setting in the Logstash csv filter, but with no luck:
columns => ["col1", "col2", ...]
Any help would be appreciated.
EDIT-1: Below is my logstash.conf file:
input {
  s3 {
    access_key_id => "xxx"
    secret_access_key => "xxxx"
    region => "eu-xxx-1"
    bucket => "xxxx"
    prefix => "abc/stocks_03-jul-2018.csv"
  }
}

filter {
  csv {
    separator => ","
    columns => ["AAA","BBB","CCC"]
  }
}

output {
  amazon_es {
    index => "my_r_index"
    document_type => "my_r_index"
    hosts => "vpc-totemdev-xxxx.eu-xxx-1.es.amazonaws.com"
    region => "eu-xxxx-1"
    aws_access_key_id => 'xxxxx'
    aws_secret_access_key => 'xxxxxx+xxxxx'
    document_id => "%{id}"
    template => "templates/template_2.json"
    template_name => "my_r_index"
  }
}
Note:
Version of logstash : 6.3.1
Version of elasticsearch : 6.2
EDIT-2: Adding my template_2.json file along with a sample CSV header:
1. Mapping file:
{
"template" : "my_r_index",
"settings" : {
"index" : {
"number_of_shards" : 50,
"number_of_replicas" : 1
},
"index.codec" : "best_compression",
"index.refresh_interval" : "60s"
},
"mappings" : {
"_default_" : {
"_all" : { "enabled" : false },
"properties" : {
"SECURITY" : {
"type" : "keyword"
},
"SERVICEID" : {
"type" : "integer"
},
"MEMBERID" : {
"type" : "integer"
},
"VALUEDATE" : {
"type" : "date"
},
"COUNTRY" : {
"type" : "keyword"
},
"CURRENCY" : {
"type" : "keyword"
},
"ABC" : {
"type" : "integer"
},
"PQR" : {
"type" : "keyword"
},
"KKK" : {
"type" : "keyword"
},
"EXPIRYDATE" : {
"type" : "text",
"index" : "false"
},
"SOMEID" : {
"type" : "double",
"index" : "false"
},
"DDD" : {
"type" : "double",
"index" : "false"
},
"EEE" : {
"type" : "double",
"index" : "false"
},
"FFF" : {
"type" : "double",
"index" : "false"
},
"GGG" : {
"type" : "text",
"index" : "false"
},
"LLL" : {
"type" : "double",
"index" : "false"
},
"MMM" : {
"type" : "double",
"index" : "false"
},
"NNN" : {
"type" : "double",
"index" : "false"
},
"OOO" : {
"type" : "double",
"index" : "false"
},
"PPP" : {
"type" : "text",
"index" : "false"
},
"QQQ" : {
"type" : "integer",
"index" : "false"
},
"RRR" : {
"type" : "double",
"index" : "false"
},
"SSS" : {
"type" : "double",
"index" : "false"
},
"TTT" : {
"type" : "double",
"index" : "false"
},
"UUU" : {
"type" : "double",
"index" : "false"
},
"VVV" : {
"type" : "text",
"index" : "false"
},
"WWW" : {
"type" : "double",
"index" : "false"
},
"XXX" : {
"type" : "double",
"index" : "false"
},
"YYY" : {
"type" : "double",
"index" : "false"
},
"ZZZ" : {
"type" : "double",
"index" : "false"
},
"KNOCKORWARD" : {
"type" : "text",
"index" : "false"
},
"RANGEATSSPUT" : {
"type" : "double",
"index" : "false"
},
"STDATMESSPUT" : {
"type" : "double",
"index" : "false"
},
"CONSENSUPUT" : {
"type" : "double",
"index" : "false"
},
"CLIENTLESSPUT" : {
"type" : "double",
"index" : "false"
},
"KNOCKOUESSPUT" : {
"type" : "text",
"index" : "false"
},
"RANGACTOR" : {
"type" : "double",
"index" : "false"
},
"STDDACTOR" : {
"type" : "double",
"index" : "false"
},
"CONSCTOR" : {
"type" : "double",
"index" : "false"
},
"CLIENTOR" : {
"type" : "double",
"index" : "false"
},
"KNOCKOACTOR" : {
"type" : "text",
"index" : "false"
},
"RANGEPRICE" : {
"type" : "double",
"index" : "false"
},
"STANDARCE" : {
"type" : "double",
"index" : "false"
},
"NUMBERICE" : {
"type" : "integer",
"index" : "false"
},
"CONSECE" : {
"type" : "double",
"index" : "false"
},
"CLIECE" : {
"type" : "double",
"index" : "false"
},
"KNOCICE" : {
"type" : "text",
"index" : "false"
},
"SKEWICE" : {
"type" : "text",
"index" : "false"
},
"WILDISED" : {
"type" : "text",
"index" : "false"
},
"WILDATUS" : {
"type" : "text",
"index" : "false"
},
"RRF" : {
"type" : "double",
"index" : "false"
},
"SRF" : {
"type" : "double",
"index" : "false"
},
"CNRF" : {
"type" : "double",
"index" : "false"
},
"CTRF" : {
"type" : "double",
"index" : "false"
},
"RANADDLE" : {
"type" : "double",
"index" : "false"
},
"STANDANSTRADDLE" : {
"type" : "double",
"index" : "false"
},
"CONSLE" : {
"type" : "double",
"index" : "false"
},
"CLIDLE" : {
"type" : "double",
"index" : "false"
},
"KNOCKOADDLE" : {
"type" : "text",
"index" : "false"
},
"RANGEFM" : {
"type" : "double",
"index" : "false"
},
"SMIUM" : {
"type" : "double",
"index" : "false"
},
"CONIUM" : {
"type" : "double",
"index" : "false"
},
"CLIEEMIUM" : {
"type" : "double",
"index" : "false"
},
"KNOREMIUM" : {
"type" : "text",
"index" : "false"
},
"COT" : {
"type" : "double",
"index" : "false"
},
"CLIEEDSPOT" : {
"type" : "double",
"index" : "false"
},
"IME" : {
"type" : "keyword"
},
"KKE" : {
"type" : "keyword"
}
}
}
}
}
2. CSV content:
Header: the actual header is quite long since there are many columns; please assume the remaining column names continue in the same fashion as below.
SECURITY | SERVICEID | MEMBERID | VALUEDATE ...
First rows: sample column values are shown below (some columns have blank values). The real template file (the mapping file above) covers all of the columns.
KKK-LMN 2 1815 6/25/2018
PPL-ORL 2 1815 6/25/2018
SLB-ORD 2 1815 6/25/2018
3. Kibana query output
Query :
GET /my_r_index/_search
{
  "query": {
    "match_all": {}
  }
}
Output:
{
"_index": "my_r_index",
"_type": "my_r_index",
"_id": "IjjIZWUBduulDsi0vYot",
"_score": 1,
"_source": {
"#version": "1",
"message": "XXX-XXX-XXX-USD,2,3190,2018-07-03,UNITED STATES,USD,300,60,Put,2042-12-19,,,,.009108041,q,,,,.269171754,q,,,,,.024127966,q,,,,68.414017367,q,,,,.298398645,q,,,,.502677959,q,,,,,0.040880692400344164,q,,,,,,,159.361792143,,,,.631296636,q,,,,.154877384,q,,42.93,N,Y,\n",
"#timestamp": "2018-08-23T07:56:06.515Z"
}
},
...Other similar records as above.
EDIT-3:
Sample output after using autodetect_column_names => true:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 10,
"successful": 10,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "indr",
"_type": "logs",
"_id": "hAF1aWUBS_wbCH7ZG4tW",
"_score": 1,
"_source": {
"2": "2",
"1815": "1815",
"message": """
PPL-ORD-XNYS-USD,2,1815,6/25/2018,UNITED STATES
""",
"SLB-ORD-XNYS-USD": "PPL-ORD-XNYS-USD",
"6/25/2018": "6/25/2018",
"#timestamp": "2018-08-24T01:03:26.436Z",
"UNITED STATES": "UNITED STATES",
"#version": "1"
}
},
{
"_index": "indr",
"_type": "logs",
"_id": "kP11aWUBctDorPcGHICS",
"_score": 1,
"_source": {
"2": "2",
"1815": "1815",
"message": """
SLBUSD,2,1815,4/22/2018,UNITEDSTATES
""",
"SLB-ORD-XNYS-USD": "SLBUSD",
"6/25/2018": "4/22/2018",
"#timestamp": "2018-08-24T01:03:26.436Z",
"UNITED STATES": "UNITEDSTATES",
"#version": "1"
}
},
{
"_index": "indr",
"_type": "logs",
"_id": "j_11aWUBctDorPcGHICS",
"_score": 1,
"_source": {
"2": "SERVICE",
"1815": "CLIENT",
"message": """
UNDERLYING,SERVICE,CLIENT,VALUATIONDATE,COUNTRY
""",
"SLB-ORD-XNYS-USD": "UNDERLYING",
"6/25/2018": "VALUATIONDATE",
"#timestamp": "2018-08-24T01:03:26.411Z",
"UNITED STATES": "COUNTRY",
"#version": "1"
}
}
]
}
}
I'm pretty certain your single document has an id of %{id}. The first problem comes from the fact that your CSV file does not contain a column named id, yet that is what you reference in document_id => "%{id}". As a result, every row is indexed with the literal id %{id}, each new row overwriting the previous one (which is what shows up as docs.deleted). At the end, you are left with a single document that has been indexed as many times as there are rows in your CSV.
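As a quick check, here is a minimal sketch of your output section with that option removed, so that Elasticsearch auto-generates a unique _id for every row; everything else is copied from your config (if you do want deterministic ids, point document_id at a column that actually exists and is unique in the CSV instead):

output {
  amazon_es {
    index => "my_r_index"
    document_type => "my_r_index"
    hosts => "vpc-totemdev-xxxx.eu-xxx-1.es.amazonaws.com"
    region => "eu-xxxx-1"
    aws_access_key_id => 'xxxxx'
    aws_secret_access_key => 'xxxxxx+xxxxx'
    # document_id removed: Elasticsearch now assigns a fresh _id per event,
    # so rows no longer overwrite each other and docs.deleted stops growing
    template => "templates/template_2.json"
    template_name => "my_r_index"
  }
}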
Regarding the second issue, you need to fix the filter section like below:
filter {
  csv {
    separator => ","
    autodetect_column_names => true
  }
  date {
    match => [ "VALUATIONDATE", "M/dd/yyyy" ]
  }
}
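If autodetect_column_names does not behave as expected (your EDIT-3 output suggests a data row, not the header, was picked up as the column names, which can happen when events are not processed in order), a sketch of an alternative is to declare the columns explicitly and drop the header line. The column names below are assumed from your sample header, and skip_header is an option available in recent versions of the csv filter:

filter {
  csv {
    separator => ","
    # assumed column names, adjust to match your real CSV header
    columns => ["UNDERLYING", "SERVICE", "CLIENT", "VALUATIONDATE", "COUNTRY"]
    # drop the event that contains the header line itself
    skip_header => true
  }
  date {
    match => [ "VALUATIONDATE", "M/dd/yyyy" ]
  }
}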
You also need to fix your index template as shown below (I've only added the format setting to the VALUATIONDATE field):
{
  "order": 0,
  "template": "helloindex",
  "settings": {
    "index": {
      "codec": "best_compression",
      "refresh_interval": "60s",
      "number_of_shards": "10",
      "number_of_replicas": "1"
    }
  },
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "UNDERLYING": {
          "type": "keyword"
        },
        "SERVICE": {
          "type": "integer"
        },
        "CLIENT": {
          "type": "integer"
        },
        "VALUATIONDATE": {
          "type": "date",
          "format": "MM/dd/yyyy"
        },
        "COUNTRY": {
          "type": "keyword"
        }
      }
    }
  },
  "aliases": {}
}
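Since you are already using Kibana, here is a small sketch of how you could install the template from the Dev Tools console and then verify that the date mapping was picked up. The body is just an abbreviated copy of the template above, and the index/template name is the example one; substitute your own:

PUT _template/helloindex
{
  "template": "helloindex",
  "mappings": {
    "_default_": {
      "properties": {
        "VALUATIONDATE": { "type": "date", "format": "MM/dd/yyyy" }
      }
    }
  }
}

GET helloindex/_mapping/field/VALUATIONDATE

Keep in mind that index templates are only applied when an index is created, so delete and re-create the index (or point Logstash at a new index name) for the new mapping to take effect.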
Related
The BigQueryInsertJobOperator in Airflow does not create a table
I'm trying to setup an Airflow job that executes a BigQuery job by calling the BigQueryInsertJobOperator operator that should create a table to store the results of a query if it doesnt exist. The setup looks like this: task3 = BigQueryInsertJobOperator( task_id="item_data", project_id="project_id", configuration={ "jobType" : "QUERY", "query" : { "query" : "{% include 'sql_query.sql' %}", "useLegacySql" : False }, "tableDefinitions" : { "fields" : [ { "name" : "DEPT_NBR", "type" : "INTEGER" }, { "name" : "ITEM_NBR", "type" : "INTEGER" }, { "name" : "CREATED_DATE", "type" : "STRING" } ] }, "destinationTable" : { "projectId" : "project_id", "datasetId" : "dataset_id", "tableId" : "table_id" }, "createDisposition" : "CREATE_IF_NEEDED", "writeDisposition" : "WRITE_APPEND", "priority" : "BATCH", "schemaUpdateOptions" : [ "ALLOW_FIELD_ADDITION" ], "timePartitioning" : { "type" : "DAY", "expirationMs" : 31556926000, "field" : "CREATED_DATE" }, "clustering" : { "fields" : [ "DEPT_NBR" ] } }, impersonation_chain="svc-account#project_id.iam.gserviceaccount.com", location="US" ) Everything executes perfectly but it does not create the table. When I check the logs, what I'm seeing is that it's storing the data in a temporary table with an expiration date of 24 hours and despite setting the priority to BATCH it's still running as INTERACTIVE. Any thoughts?
A level is missing in your configuration : task3 = BigQueryInsertJobOperator( task_id="item_data", project_id="project_id", configuration={ "query": { "query": "{% include 'sql_query.sql' %}", "useLegacySql": False, "destinationTable": { "projectId": "project_id", "datasetId": "dataset_id", "tableId": "table_id" }, "createDisposition": "CREATE_IF_NEEDED", "writeDisposition": "WRITE_APPEND", "priority": "BATCH", "schemaUpdateOptions": [ "ALLOW_FIELD_ADDITION" ], "timePartitioning": { "type": "DAY", "expirationMs": 31556926000, "field": "CREATED_DATE" }, "clustering": { "fields": [ "DEPT_NBR" ] }, "tableDefinitions": { "fields": [ { "name": "DEPT_NBR", "type": "INTEGER" }, { "name": "ITEM_NBR", "type": "INTEGER" }, { "name": "CREATED_DATE", "type": "STRING" } ] } } }, impersonation_chain="svc-account#project_id.iam.gserviceaccount.com", location="US") There is a parent node query and the other options are put inside.
elasticsearch v5 template to v6
I am currently running elasticsearch cluster version 6.3.1 on AWS and here is template file which I need to upload but can't ``` { "template" : "logstash-*", "settings" : { "index.refresh_interval" : "5s" }, "mappings" : { "_default_" : { "_all" : {"enabled" : true, "omit_norms" : true}, "dynamic_templates" : [ { "message_field" : { "match" : "message", "match_mapping_type" : "string", "mapping" : { "type" : "string", "index" : "analyzed", "omit_norms" : true, "fielddata" : { "format" : "enabled" } } } }, { "string_fields" : { "match" : "*", "match_mapping_type" : "string", "mapping" : { "type" : "string", "index" : "analyzed", "omit_norms" : true, "fielddata" : { "format" : "enabled" }, "fields" : { "raw" : {"type": "string", "index" : "not_analyzed", "doc_values" : true, "ignore_above" : 256} } } } }, { "float_fields" : { "match" : "*", "match_mapping_type" : "float", "mapping" : { "type" : "float", "doc_values" : true } } }, { "double_fields" : { "match" : "*", "match_mapping_type" : "double", "mapping" : { "type" : "double", "doc_values" : true } } }, { "byte_fields" : { "match" : "*", "match_mapping_type" : "byte", "mapping" : { "type" : "byte", "doc_values" : true } } }, { "short_fields" : { "match" : "*", "match_mapping_type" : "short", "mapping" : { "type" : "short", "doc_values" : true } } }, { "integer_fields" : { "match" : "*", "match_mapping_type" : "integer", "mapping" : { "type" : "integer", "doc_values" : true } } }, { "long_fields" : { "match" : "*", "match_mapping_type" : "long", "mapping" : { "type" : "long", "doc_values" : true } } }, { "date_fields" : { "match" : "*", "match_mapping_type" : "date", "mapping" : { "type" : "date", "doc_values" : true } } }, { "geo_point_fields" : { "match" : "*", "match_mapping_type" : "geo_point", "mapping" : { "type" : "geo_point", "doc_values" : true } } } ], "properties" : { "#timestamp": { "type": "date", "doc_values" : true }, "#version": { "type": "string", "index": "not_analyzed", "doc_values" : true }, "geoip" : { "type" : "object", "dynamic": true, "properties" : { "ip": { "type": "ip", "doc_values" : true }, "location" : { "type" : "geo_point", "doc_values" : true }, "latitude" : { "type" : "float", "doc_values" : true }, "longitude" : { "type" : "float", "doc_values" : true } } } } } } }' I tried loading the template via Dev Tools in Kibana and got the following error { "error": { "root_cause": [ { "type": "mapper_parsing_exception", "reason": "Failed to parse mapping [_default_]: No field type matched on [float], possible values are [object, string, long, double, boolean, date, binary]" } ], "type": "mapper_parsing_exception", "reason": "Failed to parse mapping [_default_]: No field type matched on [float], possible values are [object, string, long, double, boolean, date, binary]", "caused_by": { "type": "illegal_argument_exception", "reason": "No field type matched on [float], possible values are [object, string, long, double, boolean, date, binary]" } }, "status": 400 } Can somebody please help with what I need to do to have this working on version 6 elasticsearch. I am completely new to elasticsearch and am just looking to setup logging from cloudtrail -> s3 -> AWS elasticsearch -> kibana.
In order to work on 6.3, the correct mapping for the logstash index would need to be (taken from here): { "template" : "logstash-*", "version" : 60001, "settings" : { "index.refresh_interval" : "5s" }, "mappings" : { "_default_" : { "dynamic_templates" : [ { "message_field" : { "path_match" : "message", "match_mapping_type" : "string", "mapping" : { "type" : "text", "norms" : false } } }, { "string_fields" : { "match" : "*", "match_mapping_type" : "string", "mapping" : { "type" : "text", "norms" : false, "fields" : { "keyword" : { "type": "keyword", "ignore_above": 256 } } } } } ], "properties" : { "#timestamp": { "type": "date"}, "#version": { "type": "keyword"}, "geoip" : { "dynamic": true, "properties" : { "ip": { "type": "ip" }, "location" : { "type" : "geo_point" }, "latitude" : { "type" : "half_float" }, "longitude" : { "type" : "half_float" } } } } } } }
Error uploading swagger file to API Manager v2.5.0
I'm testing the currrent version of wso2 API Manager (2.5.0) and I've a problem with my current swagger files that I've already imported in the version 2.2.0. The error message is: "The HTTP method 'parameters' provided for resource '/tasks/{taskid}' is invalid": at org.wso2.carbon.apimgt.impl.utils.APIUtil.handleException(APIUtil.java:1411) at org.wso2.carbon.apimgt.impl.definitions.APIDefinitionFromOpenAPISpec.getURITemplates(APIDefinitionFromOpenAPISpec.java:124) at org.wso2.carbon.apimgt.hostobjects.APIProviderHostObject.jsFunction_updateAPIDesign(APIProviderHostObject.java:969) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:126) ... 69 more This is the API sample from studio.restlet.com: { "swagger" : "2.0", "info" : { "description" : "An API for managing a list of tasks that need to be done. \n\nDon't forget to take it for a spin by clicking on the **Try in Client** button next to each operation! All read operations are public and don't require authentication.\n", "version" : "1.1.0", "title" : "Tasks API", "termsOfService" : "", "contact" : { } }, "host" : "tasksapi.restlet.net", "basePath" : "/v1", "schemes" : [ "https" ], "consumes" : [ "application/json" ], "produces" : [ "application/json" ], "paths" : { "/tasks/" : { "get" : { "summary" : "Load the list of Tasks", "parameters" : [ { "name" : "$size", "in" : "query", "required" : false, "type" : "integer", "description" : "Size of the page to retrieve.", "x-example" : 10 }, { "name" : "$page", "in" : "query", "required" : false, "type" : "integer", "description" : "Number of the page to retrieve.", "x-example" : 1 }, { "name" : "$sort", "in" : "query", "required" : false, "type" : "string", "description" : "Order in which to retrieve the results. Multiple sort criteria can be passed. 
Example: sort=age ASC,height DESC", "x-example" : "createdAt DESC" }, { "name" : "id", "in" : "query", "required" : false, "type" : "string", "description" : "Allows to filter the collection of results by the value of field `id`", "x-example" : "47ee3550-b619-11e6-8408-0bdb025a7cfa" }, { "name" : "name", "in" : "query", "required" : false, "type" : "string", "description" : "Allows to filter the collection of results by the value of field `name`", "x-example" : "Learn about hypermedia APIs" }, { "name" : "createdAt", "in" : "query", "required" : false, "type" : "string", "description" : "Allows to filter the collection of results by the value of field `createdAt`", "x-example" : "2016.07.03" }, { "name" : "completed", "in" : "query", "required" : false, "type" : "boolean", "description" : "Allows to filter the collection of results by the value of field `completed`", "x-example" : true } ], "responses" : { "200" : { "description" : "Status 200", "schema" : { "type" : "array", "items" : { "$ref" : "#/definitions/Task" } }, "examples" : { "application/json" : "[{\n \"id\": \"47ee3550-b619-11e6-8408-0bdb025a7cfa\",\n \"name\": \"Feed the fish\",\n \"completed\": false,\n \"createdAt\": \"2016.07.03\"\n}]" }, "headers" : { "X-Page-Count" : { "type" : "integer", "x-example" : 1 }, "X-Page-Number" : { "type" : "integer", "x-example" : 1 }, "X-Page-Size" : { "type" : "integer", "x-example" : 25 }, "X-Total-Count" : { "type" : "integer", "x-example" : 2 } } }, "400" : { "description" : "Status 400", "schema" : { "$ref" : "#/definitions/Error" } } } }, "post" : { "summary" : "Create a new Task", "consumes" : [ ], "parameters" : [ { "name" : "body", "in" : "body", "required" : true, "schema" : { "$ref" : "#/definitions/Task" }, "x-examples" : { "application/json" : "{\n \"name\": \"Feed the fish\",\n \"completed\": false,\n \"createdAt\": \"2016.07.03\"\n}" } } ], "responses" : { "200" : { "description" : "Status 200", "schema" : { "$ref" : "#/definitions/Task" }, "examples" : { "application/json" : "{\n \"id\": \"47ee3550-b619-11e6-8408-0bdb025a7cfa\",\n \"name\": \"Feed the fish\",\n \"completed\": false,\n \"createdAt\": \"2016.07.03\"\n}" } } }, "security" : [ { "HTTP_BASIC" : [ ] } ] } }, "/tasks/{taskid}" : { "get" : { "summary" : "Load a specific Task", "parameters" : [ ], "responses" : { "200" : { "description" : "Status 200", "schema" : { "$ref" : "#/definitions/Task" }, "examples" : { "application/json" : "{\n \"id\": \"47ee3550-b619-11e6-8408-0bdb025a7cfa\",\n \"name\": \"Feed the fish\",\n \"completed\": false,\n \"createdAt\": \"2016.07.03\"\n}" } }, "400" : { "description" : "Status 400", "schema" : { "$ref" : "#/definitions/Error" } } } }, "put" : { "summary" : "Update a Task", "consumes" : [ ], "parameters" : [ { "name" : "body", "in" : "body", "required" : true, "schema" : { "$ref" : "#/definitions/Task" }, "x-examples" : { "application/json" : "{\n \"name\": \"Feed the fish\",\n \"completed\": false,\n \"createdAt\": \"2016.07.03\"\n}" } } ], "responses" : { "200" : { "description" : "Status 200", "schema" : { "$ref" : "#/definitions/Task" }, "examples" : { "application/json" : "{\n \"id\": \"47ee3550-b619-11e6-8408-0bdb025a7cfa\",\n \"name\": \"Feed the fish\",\n \"completed\": false,\n \"createdAt\": \"2016.07.03\"\n}" } } }, "security" : [ { "HTTP_BASIC" : [ ] } ] }, "delete" : { "summary" : "Delete a Task", "parameters" : [ ], "responses" : { "200" : { "description" : "Status 200" } }, "security" : [ { "HTTP_BASIC" : [ ] } ] }, "parameters" : [ { "name" : "taskid", "in" : 
"path", "required" : true, "type" : "string", "description" : "Identifier of the Task", "x-example" : "47ee3550-b619-11e6-8408-0bdb025a7cfa" } ] } }, "securityDefinitions" : { "HTTP_BASIC" : { "description" : "All GET methods are public, meaning that *you can read all the data*. Write operations require authentication and therefore are forbidden to the general public.", "type" : "basic" } }, "definitions" : { "Task" : { "type" : "object", "required" : [ "completed", "id", "name" ], "properties" : { "id" : { "type" : "string", "description" : "Auto-generated primary key field", "example" : "3fa2eb40-b61c-11e6-9de0-fdbe71bceebb" }, "name" : { "type" : "string", "example" : "Figure out how to colonize Mars" }, "completed" : { "type" : "boolean" }, "createdAt" : { "type" : "string", "example" : "2016.10.06" } }, "description" : "An object that represents a Task.", "example" : "{\n \"id\": \"47ee3550-b619-11e6-8408-0bdb025a7cfa\",\n \"name\": \"Feed the fish\",\n \"completed\": false,\n \"createdAt\": \"2016.07.03\"\n}" }, "Error" : { "type" : "object", "required" : [ "code" ], "properties" : { "code" : { "type" : "integer", "minimum" : 400, "maximum" : 599 }, "description" : { "type" : "string", "example" : "Bad query parameter [$size]: Invalid integer value [abc]" }, "reasonPhrase" : { "type" : "string", "example" : "Bad Request" } }, "description" : "This general error structure is used throughout this API.", "example" : "{\n \"code\": 400,\n \"description\": \"Bad query parameter [$size]: Invalid integer value [abc]\",\n \"reasonPhrase\": \"Bad Request\"\n}" } } }
This is fixed in API Manager 2.6. Please refer to the attached screen capture video for verification, and see https://github.com/wso2/product-apim/issues/3560 for the developer testing record.
Elastic Search Index template for NGINX custom log
I have the following log from NGINX: 111.111.111.111, 11.11.11.11 - 11.11.11.11 [06/May/2016:08:26:10 +0000] "POST /some-service/GetSomething HTTP/1.1" 499 0 "-" "Jakarta Commons-HttpClient/3.1" "7979798797979799" 59.370 - "{\x0A\x22correlationId\x22 : \x22TestCorr1\x22\x0A}" Logstash will be like this: input { stdin {} } output { stdout { codec => "rubydebug" } } filter { grok { match => { "message" => "%{COMBINEDAPACHELOG} %{QS:partner_id} %{NUMBER:req_time} %{GREEDYDATA:extra_fields}" } add_field => [ "received_at", "%{#timestamp}" ] add_field => [ "received_from", "%{host}" ] } mutate { gsub => ["extra_fields", "\"","", "extra_fields", "\\x0A","", "extra_fields", "\\x22",'\"', "extra_fields", "(\\)","" ] } json { source => "extra_fields" target => "extra_fields_json" } mutate { add_field => { "correlationId" => "%{[extra_fields_json][correlationId]}" } } } The problem is req_time is string, so I need to convert to float using the following template: { "template" : "filebeat*", "settings" : { "index.refresh_interval" : "5s" }, "mappings" : { "properties" : { "#timestamp": { "type": "date" }, "partner_id": { "type": "string", "index": "not_analyzed" }, "#version": { "type": "string", "index": "not_analyzed" }, "req_time" : { "type" : "float", "index" : "not_analyzed" }, "res_time" : { "type" : "string", "index" : "not_analyzed" }, "purchaseTime" : { "type" : "date", "index" : "not_analyzed" }, "received_at" : { "type" : "date", "index" : "not_analyzed" }, "itemPrice" : { "type" : "double", "index" : "not_analyzed" }, "total" : { "type" : "integer", "index" : "not_analyzed" }, "bytes" : { "type" : "double", "index" : "not_analyzed" } } } } } Verified using: curl -XGET 'http://localhost:9200/filebeat-2016.06.30/_mapping/field/req_time' I am getting: {"filebeat-2016.06.30":{"mappings":{"nginxlog":{"req_time": {"full_name":"req_time","mapping":{"req_time":{"type":"string"}}}}}}} so my template definitely does not work. Anyone can help?
In the end, I just removed the template and let ES guess the field type. It did work.
Using RegEx in JSON Schema
Trying to write a JSON schema that uses RegEx to validate a value of an item. Have an item named progBinaryName whose value should adhrere to this RegEx string "^[A-Za-z0-9 -_]+_Prog\\.(exe|EXE)$". Can not find any tutorials or examples that actually explain the use of RegEx in a JSON schema. Any help/info would be GREATLY appreciated! Thanks, D JSON SCHEMA { "name": "string", "properties": { "progName": { "type": "string", "description": "Program Name", "required": true }, "ID": { "type": "string", "description": "Identifier", "required": true }, "progVer": { "type": "string", "description": "Version number", "required": true }, "progBinaryName": { "type": "string", "description": "Actual name of binary", "patternProperties": { "progBinaryName": "^[A-Za-z0-9 -_]+_Prog\\.(exe|EXE)$" }, "required": true } } } ERRORS: Warning! Better check your JSON. Instance is not a required type - http://json-schema.org/draft-03/hyper-schema# Schema is valid JSON, but not a valid schema. Validation results: failure [ { "level" : "warning", "schema" : { "loadingURI" : "#", "pointer" : "" }, "domain" : "syntax", "message" : "unknown keyword(s) found; ignored", "ignored" : [ "name" ] }, { "level" : "error", "domain" : "syntax", "schema" : { "loadingURI" : "#", "pointer" : "/properties/ID" }, "keyword" : "required", "message" : "value has incorrect type", "expected" : [ "array" ], "found" : "boolean" }, { "level" : "error", "domain" : "syntax", "schema" : { "loadingURI" : "#", "pointer" : "/properties/progBinaryName" }, "keyword" : "required", "message" : "value has incorrect type", "expected" : [ "array" ], "found" : "boolean" }, { "level" : "error", "schema" : { "loadingURI" : "#", "pointer" : "/properties/progBinaryName/patternProperties/progBinaryName" }, "domain" : "syntax", "message" : "JSON value is not a JSON Schema: not an object", "found" : "string" }, { "level" : "error", "domain" : "syntax", "schema" : { "loadingURI" : "#", "pointer" : "/properties/progName" }, "keyword" : "required", "message" : "value has incorrect type", "expected" : [ "array" ], "found" : "boolean" }, { "level" : "error", "domain" : "syntax", "schema" : { "loadingURI" : "#", "pointer" : "/properties/progVer" }, "keyword" : "required", "message" : "value has incorrect type", "expected" : [ "array" ], "found" : "boolean" } ] Problem with schema#/properties/progBinaryName/patternProperties/progBinaryName : Instance is not a required type Reported by http://json-schema.org/draft-03/hyper-schema# Attribute "type" (["object"])
To test a string value (not a property name) against a RegEx, you should use the "pattern" keyword: { "type": "object", "properties": { "progBinaryName": { "type": "string", "pattern": "^[A-Za-z0-9 -_]+_Prog\\.(exe|EXE)$" } } } P.S. - if you want the pattern to match the key for the property (not the value), then you should use "patternProperties" (it's like "properties", but the key is a RegEx).
Your JSON schema syntax is incorrect. Change "patternProperties": { "progBinaryName": "^[A-Za-z0-9 -_]+_Prog\\.(exe|EXE)$" } to "patternProperties": { "^[A-Za-z0-9 -_]+_Prog\\.(exe|EXE)$": {} }