How can I keep MySQL and Elasticsearch in sync with elasticsearch-rails? - ruby-on-rails-4

I have existing data in MySQL.
I imported all records from MySQL into Elasticsearch with a rake task.
However, I would like to know how to keep them synchronized when I delete, update, or insert a record within Rails.
How can Rails propagate those modifications to Elasticsearch? Thanks
model
require 'elasticsearch/model'

class Job < ActiveRecord::Base
  attr_accessor :deadline_asap

  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end
Rake task
For the initial ES indexing (note that create_index! only creates the index itself; Job.import, provided by the gem, is what actually copies the existing records over):
task :import_all => :environment do
  Job.__elasticsearch__.create_index! force: true
  Job.import
end

It should already be synchronized:
adding
include Elasticsearch::Model::Callbacks
to your model ensures that any create, update, or destroy of a record will call the Elasticsearch API.
To check this out, just modify a record and run your ES search; it should give you results including the freshly updated model.
Update: Here is the Automatic Callbacks documentation in the elasticsearch-model gem.
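For reference, including that module is roughly equivalent to wiring the callbacks up yourself, which is also the route to take when you need custom behavior. A condensed sketch along the lines of the gem's README:
require 'elasticsearch/model'

class Job < ActiveRecord::Base
  include Elasticsearch::Model

  # Roughly what Elasticsearch::Model::Callbacks hooks up for you:
  after_commit on: [:create] do
    __elasticsearch__.index_document
  end

  after_commit on: [:update] do
    __elasticsearch__.update_document
  end

  after_commit on: [:destroy] do
    __elasticsearch__.delete_document
  end
end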

Related

AWS Amplify filter for @searchable annotation

Currently I am using a DynamoDB instance for my social media application. While designing the schema I stuck to the "one table" rule, so I am putting all data in the same table: posts, users, comments, etc. Now I want to make flexible queries for my data. I found out that I could use the @searchable annotation to create an Elasticsearch instance for a table which is annotated with @model.
In my GraphQL schema I only have one @model, since I only have one table. My problem now is that I don't want to make everything in the table searchable, since that would most likely be very expensive. There is some data which doesn't have to be added to the Elasticsearch instance (for example, comment-related data). How can I handle that? Do I really have to split my schema into multiple tables to be able to manage the @searchable annotation? Couldn't I decide whether a row should be sent to Elasticsearch based on the partition key / primary key, acting like a filter?
The current implementation of the amplify-cli uses a predefined Python Lambda that is added once we add the @searchable directive to one of our models.
The Lambda code cannot be edited, and currently there is no option to define a custom Lambda; you can read about it here:
https://github.com/aws-amplify/amplify-cli/issues/1113
https://github.com/aws-amplify/amplify-cli/issues/1022
If you want a custom Lambda where you can filter what goes to the Elasticsearch instance, you can follow the steps described here: https://github.com/aws-amplify/amplify-cli/issues/1113#issuecomment-476193632
The closest you can get is by creating a template in amplify\backend\api\myapiname\stacks\ where you can manage all the resources related to Elasticsearch. A good starting point is to:
1. Add @searchable to one of your models in schema.graphql
2. Run amplify api gql-compile
3. Copy the generated template from the build folder, amplify\backend\api\myapiname\build\stacks\SearchableStack.json, to amplify\backend\api\myapiname\stacks\
4. Remove the @searchable directive from the model added in step 1
5. Start editing the template you copied in step 3
6. Add a Lambda and use it in the template as the resolver for the DynamoDB stream
Using this approach will give you total control over the resources related to the Elasticsearch service, but it will also require you to do everything on your own.
Or just create a table for each model.
Hope it helps
It is now possible to override the generated streaming function code as well.
Thanks to AWS Support for the information provided; I left a message on the related GitHub issue as well: https://github.com/aws-amplify/amplify-category-api/issues/437#issuecomment-1351556948
All you need is to:
run amplify override api
edit the generated override.ts
replace the function's code via resources.opensearch.OpenSearchStreamingLambdaFunction.code
import { AmplifyApiGraphQlResourceStackTemplate } from '@aws-amplify/cli-extensibility-helper';

export function override(resources: AmplifyApiGraphQlResourceStackTemplate) {
  resources.opensearch.OpenSearchStreamingLambdaFunction.functionName = 'python_streaming_function';
  resources.opensearch.OpenSearchStreamingLambdaFunction.handler = 'index.lambda_handler';
  resources.opensearch.OpenSearchStreamingLambdaFunction.code = {
    zipFile: `
# python streaming function customized code goes here
`,
  };
}
Resources:
[1] https://docs.amplify.aws/cli/graphql/override/#customize-amplify-generated-resources-for-searchable-opensearch-directive
[2] AWS::Lambda::Function Code - Properties - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-lambda-function-code.html#aws-properties-lambda-function-code-properties

How to get the value of the <id> field from a Neo4j database with neo4j.rb in Rails?

Which method does neo4j.rb provide to get the value of the <id> field from a Neo4j database?
It depends what you mean ;)
The Neo4j.rb project has two main gems: neo4j and neo4j-core.
If you're using the neo4j gem to create ActiveNode / ActiveRel objects in Neo4j, you can either use the node_object.id / node_object.uuid methods to get the UUID generated by the gem, or the node_object.neo_id method to get the Neo ID generated by the database. The gem generates UUIDs because Neo4j internal IDs can be recycled and so don't offer the best option for external references to nodes. Keep in mind, though, that if you use the id_property method in a model, this will change the behavior of the IDs generated by the gem.
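A minimal sketch of those accessors (the Person model and its name property are hypothetical):
class Person
  include Neo4j::ActiveNode
  property :name
end

person = Person.create(name: 'Sue')
person.uuid    # => the UUID generated by the gem (person.id is equivalent by default)
person.neo_id  # => the internal Neo4j ID, an integer the database may recycle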
The neo4j-core gem allows you to make Cypher queries directly and from there you can get whatever you want. An example:
# Gets the Neo4j internal ID
Neo4j::Session.current.query("MATCH (n) WHERE n.name = 'Sue' RETURN ID(n)")

Elasticsearch self.published?

I am using the elasticsearch-rails gem. For my site I need to create custom callbacks: https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model#custom-callbacks
But I am really confused by one thing: what does if self.published? mean in this code?
I tried to use this for my models:
after_commit on: [:update] do
  place.__elasticsearch__.update_document if self.published?
end
but for my model in the console I see self.published? => false, and I don't know what that means.
From the documentation of elasticsearch-rails:
For ActiveRecord-based models, use the after_commit callback to protect your data against inconsistencies caused by transaction rollbacks:
I think the after_commit part is there to make sure everything has been committed to the database before we sync to the Elasticsearch server. The self.published? part is just the example's guard condition: a method your own model has to define (for instance, backed by a published boolean column), so the document is only indexed when the record is actually published.
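A minimal sketch of how that guard fits together, assuming a boolean published column on the table (the Place model here is illustrative):
class Place < ActiveRecord::Base
  include Elasticsearch::Model

  # ActiveRecord generates `published?` for the `published` attribute;
  # without such a column (or a hand-written method) the guard is always falsy.
  after_commit on: [:update] do
    __elasticsearch__.update_document if self.published?
  end
end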

Rails 4: run migrations as separate DB user

The situation I have is that our normal Rails DB user has full ownership in order to run migrations.
However, we use a shared DB for development, so we can't run "destructive" DB tasks against the development DB, such as rake db:drop/reset/etc....
My thought is to create 2 DB users:
rails-service
rails-migrator
The service user is the "normal" web app user that connects to the DB when the app is live. This DB user would only have standard CRUD privileges but no dropping rights.
The migrator user is the "admin" user that is only used for running migrations. This DB user would have normal "full" access to the DB such that it "could" drop the DB if that command were executed.
Question: Is there a clean way to tell Rails migrations to run as the rails-migrator user? I'm not sure how I would accomplish this aside from somehow altering the connection strings for every rails migration file, which seems like a bad idea.
In tandem with the above, I'm going to "delete" the destructive rake tasks so that a developer can't even run them.
# lib/tasks/db.rake
# See: https://coderwall.com/p/jt4e1q/disable-destructive-rake-tasks-by-environment
tasks = Rake.application.instance_variable_get '@tasks'
tasks.delete 'db:reset'
tasks.delete 'db:drop'

namespace :db do
  desc 'db:reset not available in this environment'
  task :reset do
    puts 'db:reset has been disabled'
  end

  desc 'db:drop not available in this environment'
  task :drop do
    puts 'db:drop has been disabled'
  end
end
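With those replacements in place the task names still exist, so tooling that invokes them won't crash; running rake db:drop now simply prints the 'db:drop has been disabled' notice instead of touching the shared database.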
I refer you to the answer of Matthew Rudy Jacobs from 2007 (!): https://www.ruby-forum.com/topic/123618
Luckily enough, it still works now :)
I just changed DEFINED? and the rest to ENV['AS_DB_ADMIN'] and used that to route migration access to a separate user.
On migration I used
set :default_env, { as_db_admin: true }
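A minimal sketch of that idea in config/database.yml, switching credentials on the environment variable (the rails-service / rails-migrator names come from the question; the host and password lookups are illustrative):
# config/database.yml
development:
  adapter: mysql2
  database: myapp_development
  host: shared-db.example.com
  username: <%= ENV['AS_DB_ADMIN'] ? 'rails-migrator' : 'rails-service' %>
  password: <%= ENV['AS_DB_ADMIN'] ? ENV['MIGRATOR_DB_PASSWORD'] : ENV['SERVICE_DB_PASSWORD'] %>
Migrations then run as the privileged user with AS_DB_ADMIN=true bin/rake db:migrate, while the app itself boots with the restricted rails-service credentials.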

Rails: expire action cache using a regular expression with Memcache

I am working on a Rails application and have integrated caching with memcached using Dalli. I am working on action caching and on expiring the cache using sweepers.
My sweeper code looks like this:
class BoardSweeper < ActionController::Caching::Sweeper
  observe Board

  def after_create(board)
    expire_cache(board)
  end

  def expire_cache(board)
    expire_action :controller => :boards, :action => :show, :id => "#{board.id}-#{board.title.parameterize}"
  end
end
But I want to delete the cache using a regular expression, i.e. match the URL and delete every matching entry.
If my board show URLs are like:
"boards/1/likes-collection-of-branded-products.text/javascript"
"boards/1/likes-collection-of-branded-products.text/html"
then I want to use the following expression to expire the cache:
Rails.cache.delete_matched(/boards\/1.*/)
But as per the memcached API docs, it doesn't support the delete_matched method.
I am sure there should be some way to delete on the basis of a regex. Please help.
Many Thanks!!
AFAIK the problem with memcached is that there is no simple way of retrieving all keys; that's why there is no such functionality as expiring based on a regular expression.
What you could do: use a naming convention for your cache keys and simply expire all the cache keys that you know might have been created, as in the sketch below.
This imposes a little overhead, since you may expire keys that were never created.
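A minimal sketch of that naming-convention approach, enumerating the variants the sweeper knows can exist (the list of formats is an assumption based on the URLs in the question):
class BoardSweeper < ActionController::Caching::Sweeper
  observe Board

  # Formats the show action is known to be cached under (assumed from the question).
  KNOWN_FORMATS = ['text/javascript', 'text/html']

  def expire_cache(board)
    KNOWN_FORMATS.each do |format|
      expire_action :controller => :boards, :action => :show,
                    :id => "#{board.id}-#{board.title.parameterize}",
                    :format => format
    end
  end
end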
Overall, I would not advise using action caches; there are good reasons they got excluded from Rails 4.
You could try the Gibson cache server as your primary cache store. It supports invalidating multiple keys by prefix in O(1) time due to its internal data structure, so in your case it would be just a matter of:
Rails.cache.delete_matched "boards/1/"
An ActiveSupport extension is being developed and will be released in about a week.