I have JSON documents (referred to below as "file_content") containing string fields that I want all mapped as not_analyzed. I am trying to accomplish this as follows:
if not es.indices.exists(index='telemfile'):
    es.indices.create(index='telemfile')  # create the index to work with

namapping = {
    "mappings": {
        "telemetry_file": {
            "dynamic_templates": [{
                "string_fields": {
                    "match": "*",
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "string",
                        "index": "not_analyzed"
                    }
                }
            }]
        }
    }
}

es.indices.put_mapping(index='telemfile', doc_type='telemetry_file', body=namapping)
es.index(index='telemfile', doc_type='telemetry_file', body=file_content)
but I get the following error:
MapperParsingException[Root type mapping not empty after parsing! Remaining fields: [mappings : {telemetry_file={dynamic_templates=[{string_fields={mapping={type=string, index=not_analyzed}, match=*, match_mapping_type=string}}]}}]];
Can anyone tell me what I am doing wrong?
I removed the "mappings" wrapper object from the body above, and it works: put_mapping expects the mapping definition directly, without the top-level "mappings" key (that key belongs only in the create-index body).
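For contrast, here is a minimal sketch of a body that put_mapping does accept: the same dynamic template without the "mappings" wrapper. The client call is commented out since it needs a live cluster.

```python
# Body for put_mapping: the type's mapping definition directly, with no
# top-level "mappings" wrapper (that wrapper belongs in indices.create()).
namapping = {
    "telemetry_file": {
        "dynamic_templates": [{
            "string_fields": {
                "match": "*",
                "match_mapping_type": "string",
                "mapping": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }]
    }
}

# With a live cluster you would then run:
# es.indices.put_mapping(index='telemfile', doc_type='telemetry_file', body=namapping)
```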
Scenario:
We are using AWS Elasticsearch 6.8. We have an index (index-A) whose mapping consists of multiple nested objects in a JSON hierarchy. We need to create a new index (index-B) and move all documents from index-A to index-B.
We need to create index-B with only specific fields.
We need to rename fields while reindexing.
e.g.
index-A mapping:
{
  "userdata": {
    "properties": {
      "payload": {
        "type": "object",
        "properties": {
          "Alldata": {
            "Username": { "type": "keyword" },
            "Designation": { "type": "keyword" },
            "Company": { "type": "keyword" },
            "Region": { "type": "keyword" }
          }
        }
      }
    }
  }
}
Expected structure of the index-B mapping after reindexing with renames (Company → cnm, Region → rg):
{
  "userdata": {
    "properties": {
      "cnm": { "type": "keyword" },
      "rg": { "type": "keyword" }
    }
  }
}
Steps we are following:
First, we use the Create Index API to create index-B with the above mapping structure.
Once the index is created, we create an ingest pipeline:
PUT <Elasticsearch domain endpoint>/_ingest/pipeline/my_rename_pipeline
{
  "description": "rename field pipeline",
  "processors": [
    {
      "rename": {
        "field": "payload.Company",
        "target_field": "cnm",
        "ignore_missing": true
      }
    },
    {
      "rename": {
        "field": "payload.Region",
        "target_field": "rg",
        "ignore_missing": true
      }
    }
  ]
}
Then we perform the reindex operation; the request payload is below:
let reindexParams = {
  wait_for_completion: false,
  slices: "auto",
  body: {
    "conflicts": "proceed",
    "source": {
      "size": 8000,
      "index": "index-A",
      "_source": ["payload.Company", "payload.Region"]
    },
    "dest": {
      "index": "index-B",
      "pipeline": "my_rename_pipeline",
      "version_type": "external"
    }
  }
};
Problem:
Once the reindexing is complete, all documents are transferred to the new index with the renamed fields as expected, but there is one additional field that we did not select. As you can see below, an empty "payload" object is also added to the new index's mapping after reindexing; it contains no data.
index-B looks like below after reindexing:
{
  "userdata": {
    "properties": {
      "cnm": { "type": "keyword" },
      "rg": { "type": "keyword" },
      "payload": {
        "properties": {
          "Alldata": { "type": "object" }
        }
      }
    }
  }
}
We are unable to find a workaround and need help stopping this field from being created. Any help will be appreciated.
Great job!! You're almost there: you simply need to remove the payload field within your pipeline using the remove processor and you're good:
{
  "description": "rename field pipeline",
  "processors": [
    {
      "rename": {
        "field": "payload.Company",
        "target_field": "cnm",
        "ignore_missing": true
      }
    },
    {
      "rename": {
        "field": "payload.Region",
        "target_field": "rg",
        "ignore_missing": true
      }
    },
    {
      "remove": {            <--- add this processor
        "field": "payload"
      }
    }
  ]
}
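If it helps to reason about the result, here is a rough Python sketch (not Elasticsearch code) of what the three processors do to each incoming document:

```python
def apply_pipeline(doc):
    """Mimic the pipeline: rename payload.Company -> cnm and payload.Region -> rg,
    then remove the leftover payload object entirely."""
    out = dict(doc)
    payload = dict(out.get("payload", {}))
    if "Company" in payload:            # rename with ignore_missing: true
        out["cnm"] = payload.pop("Company")
    if "Region" in payload:
        out["rg"] = payload.pop("Region")
    out.pop("payload", None)            # the remove processor drops the empty object
    return out
```

Running it on a source document like {"payload": {"Company": "Acme", "Region": "EU"}} yields {"cnm": "Acme", "rg": "EU"} with no leftover payload object, which is exactly the shape index-B should hold.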
How can I change the mapping or my input to resolve this error? I'm using Elasticsearch on AWS.
Mapping:
{
  "index_patterns": ["*-students-log"],
  "mappings": {
    "properties": {
      "Data": {
        "type": "object",
        "properties": {
          "PASSED": {
            "type": "object"
          }
        }
      },
      "insertion_timestamp": {
        "type": "date",
        "format": "epoch_second"
      }
    }
  }
}
My data:
curl -XPOST -H 'Content-Type: application/json' https://******.us-east-1.es.amazonaws.com/index_name/_doc/1 -d '{"Data": {"PASSED": ["Vivek"]},"insertion_timestamp": 1591962493}'
The error I got:
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"object mapping for [Data.PASSED] tried to parse field [null] as object, but found a concrete value"}],"type":"mapper_parsing_exception","reason":"object mapping for [Data.PASSED] tried to parse field [null] as object, but found a concrete value"},"status":400}
What is the missing or wrong piece in the above data? Is there another datatype I should use for an array of strings?
Any help would be appreciated...
JSON arrays are not considered JSON objects when ingested into Elasticsearch.
The docs state the following regarding arrays:
There is no dedicated array datatype. Any field can contain zero or
more values by default, however, all values in the array must be of
the same datatype.
So, instead of declaring the whole array as an object, specify the array entries' data type (text) directly:
PUT abc-students-log
{
  "mappings": {
    "properties": {
      "Data": {
        "type": "object",
        "properties": {
          "PASSED": {
            "type": "text"
          }
        }
      },
      "insertion_timestamp": {
        "type": "date",
        "format": "epoch_second"
      }
    }
  }
}
POST abc-students-log/_doc
{
  "Data": {
    "PASSED": ["Vivek"]
  },
  "insertion_timestamp": 1591962493
}
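As the quoted docs say, there is no array datatype: a list is just several values of the field's mapped type, and all entries must agree on that type. A small Python sketch of the constraint (the function name is illustrative, not an Elasticsearch API):

```python
def entries_share_type(value):
    """Elasticsearch has no array type: a field may hold one value or a list,
    but all list entries must be of the same datatype."""
    if not isinstance(value, list):
        return True                      # a single value is always fine
    types = {type(v) for v in value}     # distinct Python types in the list
    return len(types) <= 1

# The document from the question: a homogeneous list of strings, which is valid.
doc = {"Data": {"PASSED": ["Vivek"]}, "insertion_timestamp": 1591962493}
```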
I am trying to retrieve some company results using Elasticsearch. I want to get companies that start with "A", then "B", etc. I can do a pretty typical query with "prefix" like so:
GET apple/company/_search
{
  "query": {
    "prefix": {
      "name": "a"
    }
  },
  "fields": ["id", "name", "websiteUrl"],
  "size": 100
}
But this will return Acme as well as Lemur and Associates, so I need to distinguish between A at the beginning of the whole name versus just A at the beginning of a word.
It would seem like regular expressions would come to the rescue here, but Elasticsearch just ignores whatever I try. In tests with other applications, ^[\S]a* should get you anything that starts with A that doesn't have a space in front of it. Elasticsearch returns 0 results with the following:
GET apple/company/_search
{
  "query": {
    "regexp": {
      "name": "^[\S]a*"
    }
  },
  "fields": ["id", "name", "websiteUrl"],
  "size": 100
}
In fact, the Sense UI for Elasticsearch will immediately alert you to a "Bad String Syntax Error". That's because even in a query Elasticsearch wants some characters escaped. Nonetheless, ^[\\S]a* doesn't work either.
Searching in Elasticsearch is about the query itself, but just as much about modelling your data so it best suits the query to be used. One cannot simply index whatever and then struggle to come up with a query that does something.
The Elasticsearch way for your query is to have the following mapping for that field:
PUT /apple
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "keyword_lowercase": {
            "type": "custom",
            "tokenizer": "keyword",
            "filter": ["lowercase"]
          }
        }
      }
    }
  },
  "mappings": {
    "company": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "analyzed_lowercase": {
              "type": "string",
              "analyzer": "keyword_lowercase"
            }
          }
        }
      }
    }
  }
}
And to use this query:
GET /apple/company/_search
{
  "query": {
    "prefix": {
      "name.analyzed_lowercase": {
        "value": "a"
      }
    }
  }
}
or
GET /apple/company/_search
{
  "query": {
    "query_string": {
      "query": "name.analyzed_lowercase:A*"
    }
  }
}
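To see why the keyword tokenizer fixes this, here is a rough Python model of the two analyzers. It is illustrative only: the real standard analyzer also strips punctuation and does more than a whitespace split.

```python
def standard_terms(text):
    # Rough stand-in for the standard analyzer: split into words, lowercase each.
    return [t.lower() for t in text.split()]

def keyword_lowercase_terms(text):
    # The custom keyword_lowercase analyzer: whole value as one lowercased term.
    return [text.lower()]

def prefix_match(terms, prefix):
    # A prefix query matches a document if ANY of its indexed terms
    # starts with the prefix.
    return any(t.startswith(prefix) for t in terms)
```

With the standard analyzer, "Lemur and Associates" is indexed as the terms "lemur", "and", "associates", so a prefix query for "a" matches it; against the single keyword term "lemur and associates" it does not, while "Acme" still matches.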
I deleted all data, deleted my index, and ran the following command after verifying there were no other templates:
curl -XPUT https://search-xxxx.us-east-1.es.amazonaws.com/_template/all -d '
{
  "template": "*",
  "settings": {
    "index.refresh_interval": "5s"
  },
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "index": "not_analyzed",
              "omit_norms": true,
              "type": "string"
            }
          }
        }
      ],
      "properties": {
        "#version": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
'
I then proceeded to add some documents and expected the string fields not to be analyzed, yet I still get the same behavior: a warning from Kibana when building a visualization that the field is analyzed (hence split), so the results are messed up (which they obviously are).
Running the following before adding any data successfully added a "not_analyzed" index instruction to the oneFieldThatCould property, so this does work in the single-property case, but I need a general rule for all dynamically added properties:
curl -XPUT https://search-xxxx.us-east-1.es.amazonaws.com/production/_mapping/events -d '
{
  "properties": {
    "oneFieldThatCould": {
      "index": "not_analyzed",
      "type": "string"
    }
  }
}
'
This finally worked. I deleted the index, the type, all templates, and all data, and now all strings are created with analysis off. Not sure there's a major difference between what I tried before and this code, but it works now so I'm not gonna argue :)
curl -XPUT https://search-xxxxx.us-east-1.es.amazonaws.com/_template/all -d '
{
  "template": "*",
  "settings": {},
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      ]
    }
  }
}
'
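Conceptually, a dynamic template tells Elasticsearch which mapping to generate for each field it sees for the first time. A hedged Python sketch of that decision for the "strings" template above (the non-string branches are illustrative additions, not part of the template):

```python
def dynamic_mapping_for(value):
    # What the "strings" dynamic template asks for: every string-typed field
    # is mapped as a not_analyzed string. Other branches sketch ES's defaults.
    if isinstance(value, bool):          # check bool first: bool is an int subclass
        return {"type": "boolean"}
    if isinstance(value, str):
        return {"type": "string", "index": "not_analyzed"}
    if isinstance(value, int):
        return {"type": "long"}
    return None  # other detected types omitted in this sketch
```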
I am trying to push some messages like this to Elasticsearch:
id=1
list=asd,bcv mnmn,kjkj, pop asd dgf
so each message has an id field, which is a string, and a list field that contains a list of string values.
When I push this into Elasticsearch and try to create charts in Kibana, the default analyzer kicks in and splits my list entries on the space character, breaking up my values. I tried to create a mapping for my index as:
mapping = '''
{
    "test": {
        "properties": {
            "DocumentID": {
                "type": "string"
            },
            "Tags": {
                "type": "string",
                "index": "not_analyzed"
            }
        }
    }
}'''

es = Elasticsearch([{'host': server, 'port': port}])
indexName = "testindex"
es.indices.create(index=indexName, body=mapping)
This should create the index with the mapping I defined. Now I push the messages simply with
es.index(indexName, docType, messageBody)
but even now Kibana breaks up my values! Why was the mapping not applied?
And when I do
GET /testindex/_mapping/test
I get:
{
  "testindex": {
    "mappings": {
      "test": {
        "properties": {
          "DocumentID": {
            "type": "string"
          },
          "Tags": {
            "type": "string"
          }
        }
      }
    }
  }
}
Why did the mapping change? How can I specify the mapping when I call
es.index()?
You were very close. You need to provide the root "mappings" object when creating the index, and you don't need it when using the _mapping endpoint; that is the reason put_mapping worked and create did not. You can see this in the API.
mapping = '''
{
    "mappings": {
        "test": {
            "properties": {
                "DocumentID": {
                    "type": "string"
                },
                "Tags": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}
'''
Now this will work as expected
es.indices.create(index=indexName, body=mapping)
Hope this helps
I was able to get the correct mapping to work with
es.indices.create(index=indexName)
es.indices.put_mapping(docType, mapping, indexName)
I don't understand why
es.indices.create(index=indexName, body=mapping)
did not work; it should have worked as per the API.
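One plausible explanation, sketched as plain dicts below: older clusters were lenient about unknown top-level keys in the create-index body, so if the mapping string lacked the root "mappings" key, create() could succeed while silently ignoring the mapping, whereas put_mapping takes the type's definition directly. The body shapes are illustrative:

```python
# Body shape for indices.create(): the type mapping nests under "mappings".
create_body = {
    "mappings": {
        "test": {
            "properties": {
                "DocumentID": {"type": "string"},
                "Tags": {"type": "string", "index": "not_analyzed"},
            }
        }
    }
}

# Body shape for indices.put_mapping(): the same definition, minus the wrapper,
# which is why the two-step create + put_mapping approach worked.
put_mapping_body = create_body["mappings"]["test"]
```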