I'm trying to restore an Elasticsearch snapshot taken from an AWS-managed Elasticsearch domain (version 5.6, instance type i3.2xlarge).
While restoring it on a VM, the cluster status immediately went red and all the shards became unassigned.
{
"cluster_name" : "es-cluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 8,
"number_of_data_nodes" : 5,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 480,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 0.0
}
When I used the allocation explain API, I got the response below.
{
"node_id" : "3WEV1tHoRPm6OguKyxp0zg",
"node_name" : "node-1",
"transport_address" : "10.0.0.2:9300",
"node_decision" : "no",
"deciders" : [
{
"decider" : "replica_after_primary_active",
"decision" : "NO",
"explanation" : "primary shard for this replica is not yet active"
},
{
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index setting [index.routing.allocation.include] filters [instance_type:\"i2.2xlarge OR i3.2xlarge\"]"
},
{
"decider" : "throttling",
"decision" : "NO",
"explanation" : "primary shard for this replica is not yet active"
}
]
},
This is strange and I have never faced it before. The snapshot itself completed, so how can I ignore this setting while restoring? I tried the query below, but the issue persists.
curl -X POST "localhost:9200/_snapshot/restore/awsnap/_restore?pretty" -H 'Content-Type: application/json' -d'
{"ignore_index_settings": [
"index.routing.allocation.include"
]
}'
I found the cause and the solution.
Detailed troubleshooting steps are here: https://thedataguy.in/restore-aws-elasticsearch-snapshot-failed-index-settings/
I'm leaving this answer here so others can benefit from it.
The allocation filter is an AWS-specific setting, so I ignored it by its full key to solve the problem:
curl -X POST "localhost:9200/_snapshot/restore/awsnap/_restore?pretty" -H 'Content-Type: application/json' -d'
{"ignore_index_settings": [
"index.routing.allocation.include.instance_type"
]
}
'
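If the indices already came back red with the filter applied, a possible alternative (a sketch; "restored-index" is a placeholder name) is to reset the AWS-specific setting in place rather than re-restoring, since setting an index setting to null resets it:

```shell
# Placeholder index name; a null value resets the setting on ES 5.x.
body='{"index.routing.allocation.include.instance_type": null}'

# Sanity-check that the body is valid JSON before sending it.
echo "$body" | python3 -m json.tool > /dev/null && echo "body ok"

# Requires the cluster from the question to be reachable:
# curl -X PUT "localhost:9200/restored-index/_settings" \
#      -H 'Content-Type: application/json' -d "$body"
```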
Related
We have an OpenSearch domain on AWS.
Sometimes the cluster status and the OpenSearch Dashboards health status go yellow for a few minutes, which I assume is fine.
But today the OpenSearch Dashboards health status went red and has stayed there for a few hours. Everything else works except the Dashboards, which return error 503: {"Message":"Http request timed out connecting"}
{
"cluster_name" : "779754160511:telemetry",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"discovered_master" : true,
"discovered_cluster_manager" : true,
"active_primary_shards" : 166,
"active_shards" : 332,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
I finally tried moving to an instance type with more RAM, but the status is still red.
How can I solve this? Is there a way to restart the domain or to debug it somehow?
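Since the cluster health is green while Dashboards returns 503, it can help to probe the two planes separately. This is only a sketch: the endpoint URL is a placeholder, and the Dashboards status path assumed here is the Kibana-style /api/status behind AWS's _dashboards proxy:

```shell
# Placeholder endpoint; replace with your domain endpoint.
ENDPOINT="https://your-domain.region.es.amazonaws.com"

# Data plane: this matches the green health output above.
curl -s "$ENDPOINT/_cluster/health?pretty"

# Dashboards plane: a 503 here while the cluster is green points at the
# Dashboards process/nodes rather than the data nodes.
curl -s -o /dev/null -w "%{http_code}\n" "$ENDPOINT/_dashboards/api/status"
```

On a managed domain there is no direct way to restart the Dashboards process yourself; a configuration change that triggers a blue/green deployment (such as the instance-type change you tried) or an AWS support case are the usual levers.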
I tried to follow this example https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-data.html to load data into Neptune:
curl -X POST -H 'Content-Type: application/json' https://endpoint:port/loader -d '
{
"source" : "s3://source.csv",
"format" : "csv",
"iamRoleArn" : "role",
"region" : "region",
"failOnError" : "FALSE",
"parallelism" : "MEDIUM",
"updateSingleCardinalityProperties" : "FALSE",
"queueRequest" : "TRUE"
}'
{
"status" : "200 OK",
"payload" : {
"loadId" : "411ee078-3c44-4620-85ac-e22ef5466bbb"
}
}
I get status 200, but when I check whether the data was loaded, I get this:
curl -G 'https://endpoint:port/loader/411ee078-3c44-4620-85ac-e22ef5466bbb'
{
"status" : "200 OK",
"payload" : {
"feedCount" : [
{
"LOAD_FAILED" : 1
}
],
"overallStatus" : {
"fullUri" : "s3://source.csv",
"runNumber" : 1,
"retryNumber" : 1,
"status" : "LOAD_FAILED",
"totalTimeSpent" : 4,
"startTime" : 1617653964,
"totalRecords" : 10500,
"totalDuplicates" : 0,
"parsingErrors" : 0,
"datatypeMismatchErrors" : 0,
"insertErrors" : 10500
}
}
}
I had no idea why I got LOAD_FAILED, so I used the Get-Status API with details and errors enabled to see what caused the failure:
curl -X GET 'endpoint:port/loader/411ee078-3c44-4620-85ac-e22ef5466bbb?details=true&errors=true'
{
"status" : "200 OK",
"payload" : {
"feedCount" : [
{
"LOAD_FAILED" : 1
}
],
"overallStatus" : {
"fullUri" : "s3://source.csv",
"runNumber" : 1,
"retryNumber" : 1,
"status" : "LOAD_FAILED",
"totalTimeSpent" : 4,
"startTime" : 1617653964,
"totalRecords" : 10500,
"totalDuplicates" : 0,
"parsingErrors" : 0,
"datatypeMismatchErrors" : 0,
"insertErrors" : 10500
},
"failedFeeds" : [
{
"fullUri" : "s3://source.csv",
"runNumber" : 1,
"retryNumber" : 1,
"status" : "LOAD_FAILED",
"totalTimeSpent" : 1,
"startTime" : 1617653967,
"totalRecords" : 10500,
"totalDuplicates" : 0,
"parsingErrors" : 0,
"datatypeMismatchErrors" : 0,
"insertErrors" : 10500
}
],
"errors" : {
"startIndex" : 1,
"endIndex" : 10,
"loadId" : "411ee078-3c44-4620-85ac-e22ef5466bbb",
"errorLogs" : [
{
"errorCode" : "FROM_OR_TO_VERTEX_ARE_MISSING",
"errorMessage" : "Either from vertex, '1414', or to vertex, '70', is not present.",
"fileName" : "s3://source.csv",
"recordNum" : 0
},
What does this error even mean and what is the possible fix?
It looks as if you were trying to load some edges. When an edge is loaded, the two vertices that the edge connects must already have been loaded/created. The message:
"errorMessage" : "Either from vertex, '1414', or to vertex, '70', is not present.",
is letting you know that one (or both) of the vertices with ID values '1414' and '70' is missing. All vertices referenced by a CSV file containing edges must already exist (have been created or loaded) before the edges that reference them are loaded. If the CSV files for vertices and edges are in the same S3 location, the bulk loader can work out the order to load them in. If you ask the loader to load only a file containing edges while the vertices are not yet loaded, you will get an error like the one you shared.
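For example, in Neptune's Gremlin CSV format the vertex file supplies the ~id values that the edge file's ~from/~to columns reference (the labels and property values below are made up for illustration; only the ~-prefixed system columns are part of the format):

```
vertices.csv  (load this first, or load both files from one S3 prefix)
~id,~label,name
1414,person,Alice
70,person,Bob

edges.csv  (~from and ~to must match vertex ~ids that already exist)
~id,~from,~to,~label
e1,1414,70,knows
```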
We are transitioning to the new AWS DocumentDB service, currently from Mongo 3.2. When I run db.collection.distinct("FIELD_NAME") it returns the results really quickly. I did a database dump to AWS DocumentDB (MongoDB 3.6 compatible), and this simple query just gets stuck.
Here's my .explain() output and the indexes on the working instance versus AWS DocumentDB:
Explain function on working instance:
> db.collection.explain().distinct("FIELD_NAME")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "db.collection",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [ ]
},
"winningPlan" : {
"stage" : "PROJECTION",
"transformBy" : {
"_id" : 0,
"FIELD_NAME" : 1
},
"inputStage" : {
"stage" : "DISTINCT_SCAN",
"keyPattern" : {
"FIELD_NAME" : 1
},
"indexName" : "FIELD_INDEX_NAME",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"FIELD_NAME" : [
"[MinKey, MaxKey]"
]
}
}
},
"rejectedPlans" : [ ]
},
Explain on AWS documentdb, not working:
rs0:PRIMARY> db.collection.explain().distinct("FIELD_NAME")
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "db.collection",
"winningPlan" : {
"stage" : "AGGREGATE",
"inputStage" : {
"stage" : "HASH_AGGREGATE",
"inputStage" : {
"stage" : "COLLSCAN"
}
}
}
},
}
Index on both of these instances:
{
"v" : 1,
"key" : {
"FIELD_NAME" : 1
},
"name" : "FIELD_INDEX_NAME",
"ns" : "db.collection"
}
Also, this collection has a couple of million documents, but there are only about 20 distinct values for "FIELD_NAME". Any help would be appreciated.
I tried it with .hint("index_name") and that didn't work. I also tried clearing the plan cache, but I get Feature not supported: planCacheClear.
COLLSCAN and IXSCAN don't differ much in this case: both have to scan every document or every index entry. The fast path on stock MongoDB is the DISTINCT_SCAN stage shown in your first explain output, which skips ahead between distinct index keys instead of reading every entry, and DocumentDB's planner is not choosing it here.
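If the hang itself is the blocker, one workaround sketch (collection and field names follow the question; whether DocumentDB can serve this from the index is not guaranteed) is to fetch the same distinct set with an aggregation $group instead of distinct():

```javascript
// In the mongo shell connected to DocumentDB: group on the field and
// collect the group keys; this yields the same ~20 values as distinct().
db.collection.aggregate([
  { $group: { _id: "$FIELD_NAME" } }
]).map(doc => doc._id)
```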
I want to match multiple start strings in Mongo. explain() shows that it's using the indexedfield index for this query:
db.mycol.find({indexedfield:/^startstring/,nonindexedfield:/somesubstring/});
However, the following query for multiple start strings is really slow, and when I run explain on it I get an error. Judging by the faults I can see in mongostat (7k per second), it's scanning the entire collection. It's also alternating between 0% locked and 90-95% locked every few seconds.
db.mycol.find({indexedfield:/^(startstring1|startstring2)/,nonindexedfield:/somesubstring/}).explain();
JavaScript execution failed: error: { "$err" : "assertion src/mongo/db/key.cpp:421" } at src/mongo/shell/query.js:L128
Can anyone shed some light on how I can do this or what is causing the explain error?
UPDATE - more info
Ok, so I managed to get explain to work on the more complex query by limiting the number of results. The difference is this:
For a single prefix, /^BA1/ (yes, they're postcodes):
"cursor" : "BtreeCursor pc_1 multi",
"isMultiKey" : false,
"n" : 10,
"nscannedObjects" : 10,
"nscanned" : 10,
"nscannedObjectsAllPlans" : 19,
"nscannedAllPlans" : 19,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"indexedfield" : [
[
"BA1",
"BA2"
],
[
/^BA1/,
/^BA1/
]
]
}
For multiple prefixes, /^(BA1|BA2)/:
"cursor" : "BtreeCursor pc_1 multi",
"isMultiKey" : false,
"n" : 10,
"nscannedObjects" : 10,
"nscanned" : 1075276,
"nscannedObjectsAllPlans" : 1075285,
"nscannedAllPlans" : 2150551,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 5,
"nChunkSkips" : 0,
"millis" : 4596,
"indexBounds" : {
"indexedfield" : [
[
"",
{
}
],
[
/^(BA1|BA2)/,
/^(BA1|BA2)/
]
]
}
which doesn't look very good.
$or solves the problem in terms of using the indexes (thanks EddieJamsession). Queries are now lightning fast.
db.mycoll.find({$or: [{indexedfield:/^startstring1/},{indexedfield:/^startstring2/}], nonindexedfield:/somesubstring/})
However, I would still like to do this with a single regex if possible, so I'm leaving the question open, not least because I now have to refactor my application to account for these types of queries.
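A small helper can generate the $or form from a list of prefixes, which keeps the refactor mechanical (the helper name and shape below are my own, not from any driver):

```javascript
// Build {$or: [{field:/^p1/}, {field:/^p2/}, ...], ...extra} from a list
// of literal prefixes. Anchored, non-alternating regexes like these are
// the form Mongo can turn into index range scans.
function prefixOrQuery(field, prefixes, extra = {}) {
  // Escape regex metacharacters so each prefix is matched literally.
  const escape = p => p.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const or = prefixes.map(p => ({ [field]: new RegExp('^' + escape(p)) }));
  return { $or: or, ...extra };
}

// The postcode query from the update above:
const q = prefixOrQuery('indexedfield', ['BA1', 'BA2'],
                        { nonindexedfield: /somesubstring/ });
console.log(q.$or[0].indexedfield.source); // "^BA1"
```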
The post seems long, but that is only because of the data (samples and errors).
I am trying to build a bucket mimicking the BuildFailed sample in CEP 2.1.0 (that sample works).
I have created my own stream and my own sample data.
Yet it seems that the CEP input handler is having trouble with my events.
So far I have not found the issue.
The stream def :
{
"name":"eu.ima.event.stream",
"version": "1.2.0",
"nickName": "poc sample",
"description": "poc sample stream",
"metaData":[
{
"name":"host",
"type":"string"
}
],
"correlationData":[
{
"name":"processus",
"type":"string"
},
{
"name":"flux",
"type":"string"
},
{
"name":"reference",
"type":"string"
}
],
"payloadData":[
{
"name":"timestamp",
"type":"string"
},
{ "name":"code",
"type":"string"
},
{
"name":"category",
"type":"string"
},
{
"name":"msg",
"type":"string"
}
]
}
The events data :
[
{
"metaData" : ["192.168.1.2"] ,
"correlationData" : ["PSOR", "Appli2", "Ref-1"] ,
"payloadData" : ["1363700128138496600", "6", "BIZ", "6"]
}
,
{
"metaData" : ["192.168.1.2"] ,
"correlationData" : ["PSOR", "Appli2", "Ref-0"] ,
"payloadData" : ["1363700126353394500", "6", "BIZ", "6"]
}
,
{
"metaData" : ["192.168.1.2"] ,
"correlationData" : ["PSOR", "Appli2", "Ref-3"] ,
"payloadData" : ["1363700131731702100", "6", "BIZ", "6"]
}
,
{
"metaData" : ["192.168.1.2"] ,
"correlationData" : ["PSOR", "Appli2", "Ref-2"] ,
"payloadData" : ["1363700129894597000", "6", "BIZ", "6"]
}
,
{
"metaData" : ["192.168.1.2"] ,
"correlationData" : ["PSOR", "Appli2", "Ref-4"] ,
"payloadData" : ["1363700133472801700", "6", "BIZ", "6"]
}
]
When I send the stream definition, there is no error and nothing in the log except the admin connection (more feedback from the server would help here). I use the curl POST command.
When I send the events, I get errors:
[2013-03-19 14:58:00,586] ERROR {org.wso2.carbon.databridge.core.internal.queue.QueueWorker} - Error in passing event eventList [
Event{
streamId='eu.ima.event.stream:1.2.0',
timeStamp=0,
metaData=[192.168.1.2],
correlationData=[PSOR, Appli2, Ref-1],
payloadData=[1363700128138496600, 6, BIZ, 6],
arbitraryDataMap=null,
}
,
Event{
streamId='eu.ima.event.stream:1.2.0',
timeStamp=0,
metaData=[192.168.1.2],
correlationData=[PSOR, Appli2, Ref-0],
payloadData=[1363700126353394500, 6, BIZ, 6],
arbitraryDataMap=null,
}
,
Event{
streamId='eu.ima.event.stream:1.2.0',
timeStamp=0,
metaData=[192.168.1.2],
correlationData=[PSOR, Appli2, Ref-3],
payloadData=[1363700131731702100, 6, BIZ, 6],
arbitraryDataMap=null,
}
,
Event{
streamId='eu.ima.event.stream:1.2.0',
timeStamp=0,
metaData=[192.168.1.2],
correlationData=[PSOR, Appli2, Ref-2],
payloadData=[1363700129894597000, 6, BIZ, 6],
arbitraryDataMap=null,
}
,
Event{
streamId='eu.ima.event.stream:1.2.0',
timeStamp=0,
metaData=[192.168.1.2],
correlationData=[PSOR, Appli2, Ref-4],
payloadData=[1363700133472801700, 6, BIZ, 6],
arbitraryDataMap=null,
}
] to subscriber org.wso2.carbon.broker.core.internal.brokers.agent.AgentBrokerType$AgentBrokerCallback#2d7fbbd6
java.lang.NullPointerException
at org.wso2.carbon.cep.core.mapping.input.mapping.TupleInputMapping.getValue(TupleInputMapping.java:126)
at org.wso2.carbon.cep.core.mapping.input.mapping.TupleInputMapping.convertToEventTuple(TupleInputMapping.java:97)
at org.wso2.carbon.cep.core.mapping.input.mapping.InputMapping.convert(InputMapping.java:42)
at org.wso2.carbon.cep.core.listener.TopicEventListener.onEvent(TopicEventListener.java:50)
at org.wso2.carbon.cep.core.listener.BrokerEventListener.onEvent(BrokerEventListener.java:58)
at org.wso2.carbon.broker.core.internal.brokers.agent.AgentBrokerType$AgentBrokerCallback.receive(AgentBrokerType.java:176)
at org.wso2.carbon.databridge.core.internal.queue.QueueWorker.run(QueueWorker.java:80)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Please, does anyone have any hints?
I really need this to keep going with my CEP proof-of-concept project.
Best regards,
Cyril
I have gone through the details you gave above, but without the bucket configuration and the complete error log it is hard to say what went wrong. I checked the stream definition and the events you posted, and they work perfectly without any issue, so I suspect a simple mistake was made when creating the bucket. Here I am sharing the bucket XML that I created (note: change the email address in the output topic).
events json : link [1]
stream json : link [2]
bucket xml : link [3]
curl command for Stream :
curl -k --user admin:admin https://localhost:9443/datareceiver/1.0.0/streams/ --data @streamdefn2.json -H "Accept: application/json" -H "Content-type: application/json" -X POST
curl command for events :
curl -k --user admin:admin https://localhost:9443/datareceiver/1.0.0/stream/eu.ima.event.stream/1.2.0/ --data @events2.json -H "Accept: application/json" -H "Content-type: application/json" -X POST
(Please follow the doc [4] thoroughly for more details)
[1] https://docs.google.com/file/d/0B056dKd2JQGJa0pFaU1BTDlEbFk/edit?usp=sharing
[2] https://docs.google.com/file/d/0B056dKd2JQGJUFdUN21GRGpzY0k/edit?usp=sharing
[3] https://docs.google.com/file/d/0B056dKd2JQGJa0pFaU1BTDlEbFk/edit?usp=sharing
[4] http://docs.wso2.org/wiki/display/CEP210/Build+Analyzer
Hope this will help you...
Regards,
Mohan