crate.io FULLTEXT SEARCH fuzziness - levenshtein-distance

I would like to use Levenshtein distance and I'm looking for some examples.
I have already read the documentation, but I don't know how to implement it.
I tried to build my own analyzer, but it crashed every time I used it.
Here is the documentation I followed:
https://crate.io/docs/reference/sql/fulltext.html
Example table:
CREATE TABLE IF NOT EXISTS "test"."accounts" (
"customer_name" STRING INDEX USING FULLTEXT WITH (
analyzer = 'standard'
),
"customer_no" STRING,
PRIMARY KEY ("customer_no")
)
INSERT INTO "test"."accounts"
(customer_name, customer_no)
VALUES('Walmart','C00001');
My goal is to search for Wal-mart and get Walmart back.

The standard analyzer you are using for this example would split the search word ‘wal-mart’ (because of the hyphen) into two tokens: ‘wal’ and ‘mart’. Since this is probably not what you want for the described use case, I would recommend adding a custom analyzer such as:
create ANALYZER lowercase_keyword (
TOKENIZER keyword,
TOKEN_FILTERS (
lowercase
)
);
This indexes the word as it is, except for turning it into lowercase.
Then create a table with the newly created analyzer and add some data:
CREATE TABLE IF NOT EXISTS "test"."accounts" (
"customer_name" STRING INDEX USING FULLTEXT WITH (
analyzer = 'lowercase_keyword'
),
"customer_no" STRING,
PRIMARY KEY ("customer_no")
);
INSERT INTO "test"."accounts" (customer_name, customer_no) VALUES ('Walmart', 'C00001'), ('Wal-mart', 'C00002'), ('wal-mart', 'C00003'), ('wal- mart', 'C00004');
Now the query given below returns ‘Walmart’, ‘Wal-mart’ and ‘wal-mart’:
select customer_name from test.accounts where match(customer_name, 'walmart') using best_fields with (fuzziness=1);
With a fuzziness of 2 the query would additionally return ‘wal- mart’.
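To see why a fuzziness of 1 is enough for ‘wal-mart’ while ‘wal- mart’ needs 2, it can help to compute the edit distances by hand. The following is a minimal Python sketch of the classic Levenshtein distance, not Crate's actual implementation (the underlying search engine may additionally count a swap of two adjacent characters as a single edit). The function name levenshtein is only illustrative, and the strings are compared in lowercase because the analyzer lowercases the indexed terms:

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] is the distance between the already processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


for term in ("walmart", "wal-mart", "wal- mart"):
    print(term, levenshtein("walmart", term))
# walmart 0
# wal-mart 1
# wal- mart 2
```

The distances line up with the query results above: terms at distance 0 or 1 match with fuzziness=1, and the term at distance 2 only matches once fuzziness is raised to 2.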

How to search a string in an Oracle Apex page having multiple reports?

Suppose there are 3 reports in an Oracle Apex page and I want to search all the columns of all the reports for the string the user enters. Any ideas?
Create your own search item; let's call it P1_SEARCH. Modify all the reports' queries and add a condition to their WHERE clauses, e.g.
select ...
from ...
where ...
-- add this:
and ( (column_1 = :P1_SEARCH or
column_2 = :P1_SEARCH or
...
)
or :P1_SEARCH is null
)
Columns you'd use should make sense (i.e. there's no use in searching DATE datatype columns for "Ashi", is there?).
Looking at what you would have to do (if you follow that suggestion, of course), I'd say that it is simpler to use the Apex engine and search 3 times, report by report.
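If the same condition has to be added to several report queries, it could be generated rather than typed by hand. This is only a sketch in Python: the helper search_condition and the column names in the usage line are hypothetical, and the only thing taken from the answer above is the shape of the condition and the item name P1_SEARCH:

```python
def search_condition(columns, item="P1_SEARCH"):
    """Build the extra WHERE condition for one report query.

    columns: names of the columns that make sense to search (hypothetical).
    item:    name of the Apex page item used as the bind variable.
    """
    if not columns:
        raise ValueError("at least one column is required")
    comparisons = " or ".join(f"{column} = :{item}" for column in columns)
    # The trailing "is null" branch keeps all rows when nothing was entered.
    return f"and (({comparisons}) or :{item} is null)"


print(search_condition(["ename", "job"]))
# and ((ename = :P1_SEARCH or job = :P1_SEARCH) or :P1_SEARCH is null)
```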

Error in KeyConditionExpression when using contains on partition key

I have Tags as the partition key in my table, and when I try to query it I get an AttributeError.
Below is my code:
kb_table = boto3.resource('dynamodb').Table('table_name')
result = kb_table.query(
KeyConditionExpression=Key('Tags').contains('search term')
)
return result['Items']
Error:
"errorMessage": "'Key' object has no attribute 'contains'"
Basically I want to search the table for items where the field contains that search term. I have achieved it using scan, but I have read everywhere that we should not use that.
result = kb_table.scan(
FilterExpression="contains (Tags, :titleVal)",
ExpressionAttributeValues={ ":titleVal": "search term" }
)
So I changed my partition key to Tags, along with a sort key, so that I could achieve this using query, but now I am getting this error.
Any idea how to get this working?
In order to use Query you must specify exactly one partition to access; you cannot wildcard a partition key or specify multiple keys.
KeyConditionExpression
The condition that specifies the key value(s)
for items to be retrieved by the Query action.
The condition must perform an equality test on a single partition key
value.
Assuming you want to search the whole table for tags, a scan is the most appropriate approach.
EDIT: You can use Query with the exact search term, but I'm guessing that is not what you want.
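import boto3
from boto3.dynamodb.conditions import Key

kb_table = boto3.resource('dynamodb').Table('table_name')
# eq() is the only condition allowed on the partition key
result = kb_table.query(
    KeyConditionExpression=Key('Tags').eq('search term')
)
return result['Items']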
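If a scan does turn out to be the right tool, keep in mind that a single Scan call reads at most 1 MB of data and reports further pages through LastEvaluatedKey, and that the filter is applied after the items are read, so the whole table is still consumed. Below is a minimal sketch of a paginated scan that reuses the filter expression from the question. The function name scan_tags_containing is hypothetical, and table is assumed to be a boto3 Table resource such as the kb_table above:

```python
def scan_tags_containing(table, term):
    """Collect every item whose Tags attribute contains term.

    table is assumed to behave like a boto3 DynamoDB Table resource.
    """
    kwargs = {
        "FilterExpression": "contains(Tags, :term)",
        "ExpressionAttributeValues": {":term": term},
    }
    items = []
    while True:
        page = table.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            # No more pages to read.
            return items
        # Continue the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


# Usage (requires boto3 and AWS credentials):
#   kb_table = boto3.resource('dynamodb').Table('table_name')
#   items = scan_tags_containing(kb_table, 'search term')
```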

How to add a string on a specific string by using regex_replace method in Oracle

I am trying to add the string '_$' to an index name and a table name as follows. I need to use the 'regexp_replace' function in a SELECT statement.
select regexp_replace(input_string......)
# Input
CREATE UNIQUE INDEX "SCOTT"."PK_EMP" ON "SCOTT"."EMP" ("EMP_NO")
# Desired Output
CREATE UNIQUE INDEX "SCOTT"."PK_EMP_$" ON "SCOTT"."EMP_$" ("EMP_NO")
Can you help me to build a regular expression for that?
A fairly brute-force solution would be to use the following pattern:
(.*)(" ON ".*)(" \(.*)
with the following replace string:
\1_$\2_$\3
The pattern works by splitting the input at the places where you need to insert the _$ token, and then joining it back together with the token inserted at each split point:
CREATE UNIQUE INDEX "SCOTT"."PK_EMP|" ON "SCOTT"."EMP|" ("EMP_NO")
The full SELECT query would look like this:
SELECT REGEXP_REPLACE(
'CREATE UNIQUE INDEX "SCOTT"."PK_EMP" ON "SCOTT"."EMP" ("EMP_NO")',
'(.*)(" ON ".*)(" \(.*)',
'\1_$\2_$\3'
) RX
FROM dual;
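To check how the three groups split the input without a database at hand, the same pattern and replacement string can be tried in Python's re module. This is only a sketch for experimenting: Python's regex flavour is not identical to Oracle's, although this particular pattern uses only constructs that both support.

```python
import re

ddl = 'CREATE UNIQUE INDEX "SCOTT"."PK_EMP" ON "SCOTT"."EMP" ("EMP_NO")'

# Same three groups as in the Oracle pattern above.
pattern = r'(.*)(" ON ".*)(" \(.*)'

# Show where the input is split.
print(re.match(pattern, ddl).groups())

# Re-join the groups with the _$ token inserted after groups 1 and 2.
result = re.sub(pattern, r'\1_$\2_$\3', ddl)
print(result)
# CREATE UNIQUE INDEX "SCOTT"."PK_EMP_$" ON "SCOTT"."EMP_$" ("EMP_NO")
```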

REGEX: get Create table queries in sql dump

I have an SQL dump of different tables, each with a different number of fields. I want to insert a query after each one, so I'm trying to find a regex statement that would retrieve:
CREATE TABLE cms_audit (
aud_id bigint NOT NULL IDENTITY(1,1),
user_id int DEFAULT NULL,
client_id int NOT NULL,
aud_event varchar(500) NOT NULL,
aud_type varchar(150) NOT NULL,
aud_string varchar(1000) DEFAULT NULL,
aud_date datetime DEFAULT NULL
)
-- --------------------------------------------------------
My regex is CREATE TABLE .*-- (in Notepad++ I've checked the box that says ". matches newline"). In my head this means: match everything that starts with "CREATE TABLE" and whatever comes after it until you reach "--".
However, this statement matches the entire file instead of matching each "create table" query separately. What am I missing?
I have also tried CREATE TABLE (.*|\n)*-- but it didn't work.
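Your .* is greedy, so it runs all the way to the last -- in the file. You need a regex that consumes characters only as long as they do not start a -- sequence. To achieve this you can use: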
CREATE TABLE (?:(?!--).)*
EDIT
The (?!--) part is a negative lookahead on the string --. It is checked before every single character, so the match advances one character at a time and stops right before the first --.
You can see and test it with this link (it's very well explained and a good tool):
https://regex101.com/r/mR9fD4/1
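The same expression can also be tried outside Notepad++. The sketch below uses Python's re module, where the re.DOTALL flag plays the role of the ". matches newline" checkbox. The dump is a shortened version of the one in the question, and the second table, cms_user, is made up purely for illustration:

```python
import re

dump = """CREATE TABLE cms_audit (
  aud_id bigint NOT NULL IDENTITY(1,1),
  aud_date datetime DEFAULT NULL
)
-- --------------------------------------------------------
CREATE TABLE cms_user (
  user_id int NOT NULL
)
-- --------------------------------------------------------
"""

# Consume one character at a time, but only while it does not start "--".
CREATE_TABLE = re.compile(r"CREATE TABLE (?:(?!--).)*", re.DOTALL)

# strip() drops the newline that precedes each "--" separator line.
statements = [match.strip() for match in CREATE_TABLE.findall(dump)]

for statement in statements:
    print(statement)
    print("=====")
```

Each statement is matched separately because every match ends at the first separator line that follows it.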

How to format the id column with SHA1 digests in Rails application?

Without saving the SHA1 digest string in the table directly, is it possible to format the column in the select statement?
For example (hope you know what I mean):
#item = Item.where(Digest::SHA1.hexdigest id.to_s:'356a192b7913b04c54574d18c28d46e6395428ab')
No, not the way you want it. The hexdigest method you're using won't be available at the database level. You could use database-specific functions though.
For example:
Item.where("LOWER(name) = ?", entered_name.downcase)
The LOWER() function is available in the database, so the name column can be passed to it there.
For your case, I can suggest two solutions:
The obvious one: store the digest in its own column in the table, and then match against it.
key = '356a192b7913b04c54574d18c28d46e6395428ab'
Item.where(encrypted_id: key)
Iterate over all column values (ID, in your case) and find the one that matches:
all_item_ids = Item.pluck("CAST(id AS TEXT)")
item_id = all_item_ids.find{ |val| Digest::SHA1.hexdigest(val) == key }
Then you could use Item.find(item_id) to get the item or Item.where(id: item_id) to get an ActiveRecord::Relation object.
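The second approach is a linear search over all IDs. To show the idea outside Rails, here is a minimal sketch in Python using hashlib. The function name find_id_by_digest and the ID range are assumptions for illustration; in the real application the IDs would come from the database, as in the pluck call above.

```python
import hashlib


def find_id_by_digest(ids, digest):
    """Return the first id whose SHA1 hex digest equals digest, or None."""
    for item_id in ids:
        candidate = hashlib.sha1(str(item_id).encode("utf-8")).hexdigest()
        if candidate == digest:
            return item_id
    return None


key = "356a192b7913b04c54574d18c28d46e6395428ab"
print(find_id_by_digest(range(1, 1001), key))
# 1
```

Every lookup hashes each candidate ID until a match is found, so the cost grows with the size of the table. Storing the digest in a column, as in the first solution, avoids that.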