Regular expression to extract INSERT SQL statements from a text file and check for hardcoded parameters

I have a bunch of SQL statements updated by the developers on my team.
I intend to run a check before these statements are run against a database.
For example, I want to check whether a certain column is hardcoded instead of being fetched from the respective table (foreign key).
For instance:
INSERT INTO [Term1] ([CreatedBy]
,[CreateUser]) values(1,'asdadad')
where 1 is a hardcoded value.
Is there a regular expression that can extract all INSERT statements from the file so that they can be parsed?
I tried this expression, http://regexlib.com/REDetails.aspx?regexp_id=1750, but it did not work.

You may need to run a multi-level regex on this. First extract the entire parameter string from the whole query, then parse each individual field out of that parameter string, ignoring any other characters that may come up.
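The two-level approach can be sketched in Python as below. This is only an illustration under stated assumptions: each INSERT has a single VALUES tuple, there are no parentheses inside the values, and the names `INSERT_RE`, `VALUE_RE`, and `find_hardcoded` are made up for this example. A numeric literal is treated as "hardcoded".

```python
import re

# Level 1: one INSERT statement -> table name, column list, VALUES list.
# Assumes a single VALUES tuple and no parentheses inside the values.
INSERT_RE = re.compile(
    r"INSERT\s+INTO\s+(\[?\w+\]?)\s*\(([^)]*)\)\s*VALUES\s*\(([^)]*)\)",
    re.IGNORECASE,
)

# Level 2: one field of the VALUES list, either a quoted string or a bare token.
VALUE_RE = re.compile(r"'(?:[^']|'')*'|[^,\s]+")


def find_hardcoded(sql_text, column):
    """Return (table, value) pairs where `column` receives a numeric literal."""
    hits = []
    for table, cols, vals in INSERT_RE.findall(sql_text):
        columns = [c.strip().strip("[]") for c in cols.split(",")]
        values = VALUE_RE.findall(vals)
        for name, value in zip(columns, values):
            if name.lower() == column.lower() and re.fullmatch(r"-?\d+", value):
                hits.append((table.strip("[]"), value))
    return hits


sample = """INSERT INTO [Term1] ([CreatedBy]
,[CreateUser]) values(1,'asdadad')"""
print(find_hardcoded(sample, "CreatedBy"))  # [('Term1', '1')]
```

Statements with subqueries, function calls, or multiple VALUES tuples would need a real SQL parser rather than this pattern.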

Related

RegEx to get unique values from large file with duplicates

I have a large XML-file that I want to extract unique values from. The values I'm looking for are placed in the XML-tag: ns3:order_id
To make it more complex, the file contains duplicates of order_id, and I'm only interested in getting the unique order_id values.
I've been using RegEx to extract the values, this is the expression:
(?sm)(\<ns3:order_id>\d+\b)(?!.*\1\b)
The expression gives me what I need, but only if the file is much smaller. When I try it on the "big" file I receive: "Catastrophic backtracking has been detected and the execution of your expression has been halted." I guess it has to do with the *, and I have tried different ways of replacing it, without success.
Is there any way to correct my expression so that I can collect the values?
As described above, I've tried several different regex approaches. The expression above works, but not on bigger files.
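One possible way around this, assuming the de-duplication does not have to happen inside the regex itself: the negative lookahead `(?!.*\1\b)` rescans the rest of the file for every candidate match, so the work grows quickly with file size. A simple pattern plus a set avoids that. The sketch below uses Python; `unique_order_ids` and the sample string are made up for illustration. Unlike the original expression, which keeps the last occurrence of each value, it keeps the first.

```python
import re

# Simple, linear pattern: no lookahead, no backreference.
ORDER_ID_RE = re.compile(r"<ns3:order_id>(\d+)")


def unique_order_ids(xml_text):
    """Return order_id values in first-seen order, without duplicates."""
    seen = set()
    unique = []
    for order_id in ORDER_ID_RE.findall(xml_text):
        if order_id not in seen:
            seen.add(order_id)
            unique.append(order_id)
    return unique


sample = (
    "<ns3:order_id>17</ns3:order_id>"
    "<ns3:order_id>42</ns3:order_id>"
    "<ns3:order_id>17</ns3:order_id>"
)
print(unique_order_ids(sample))  # ['17', '42']
```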

Google sheet Regex

I'm trying to fetch the meaning of an entered word from Urban Dictionary. The problem is that Urban Dictionary shows several definitions posted by different users. I've used IMPORTXML to fetch the first page that shows up when someone searches for a particular word.
Now I want this data split into different columns so that each definition ends up in a separate column.
If we look at the fetched data, at the end of every definition there is a "by username month dd,yyyy" string.
How can I use this string to split that raw data into definitions in separate columns?
I tried regex but could not figure it out, because this is the first time I'm using it.
Replace the string with a unique symbol and then split by it.
To capture the string, use the pattern:
"by username .+ \d+,\d{4}"
As you can read here, regex is not the correct tool for parsing HTML.
In your situation I would use Google Apps Script in combination with a DOM-parsing library such as cheerio.
Example:
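// getContent_ is assumed to be a small helper that returns the page HTML as a string
const getContent_ = (url) => UrlFetchApp.fetch(url).getContentText();
const content = getContent_('https://www.urbandictionary.com/define.php?term=nah');
const $ = Cheerio.load(content);
// Log the text of every element with the class "contributor"
Logger.log($('.contributor').text());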
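The "replace with a unique symbol, then split" idea from the first answer can be sketched as follows. This is a Python illustration, not Sheets syntax. The byline pattern is an assumption about the "by username month dd,yyyy" shape (username may contain spaces, month is a capitalised word), and the sample text is invented.

```python
import re

# Assumed byline shape: "by <username> <Month> <dd>,<yyyy>" (optional space after the comma).
BYLINE_RE = re.compile(r"\bby \S.*? [A-Z][a-z]+ \d{1,2}, ?\d{4}")

# A character that should never occur in the scraped text.
DELIMITER = "\x1f"


def split_definitions(raw):
    """Replace each byline with a unique delimiter, then split on it."""
    marked = BYLINE_RE.sub(DELIMITER, raw)
    return [part.strip() for part in marked.split(DELIMITER) if part.strip()]


raw = (
    "An informal way to say no. by some_user March 03, 2015 "
    "Short for 'not really'. by another user June 21,2018"
)
print(split_definitions(raw))
```

A definition that itself contains text shaped like a byline would be split in the wrong place, which is one reason the DOM-based approach above is more robust.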

Use multiple replace conditions for a single column in Amazon Redshift

I have a table where the amount column contains , and $ signs, for example $8,122.14. I want to write a replace function that removes $ and , from that column in one go. Is there any way to write multiple conditions in one replace in Redshift? This is a part of post-processing the data, where I am inserting data from a stage table into a final table after replacing these values.
I tried the two approaches listed below as Take 1 and Take 2, but both of them failed.
Take 1:
insert into db.stage_table
select
(coalesce(replace(logging_amount,'$',','),''))) as logging_amount
from db.table;
Take 2:
insert into db.stage_table
select
(coalesce(replace(logging_amount,'$',',')) as logging_amount
from db.table;
The expected result is a single statement that performs both replacements.
Yes, you can nest replace calls like this:
replace(replace(logging_amount,'$',''),',','')
Or you can use a regex if you prefer (personally, for something like this I think nested replaces are easier to read).
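To compare the two options outside the database, here is a small Python sketch of the same logic. The function names are made up. The character class `[$,]` is the kind of pattern one would presumably hand to a regex replace function such as Redshift's REGEXP_REPLACE, but that part is an assumption and is not tested here.

```python
import re


def strip_nested(amount):
    # Mirrors replace(replace(logging_amount,'$',''),',','')
    return amount.replace("$", "").replace(",", "")


def strip_regex(amount):
    # One pass with a character class covering both symbols;
    # inside [...] the $ is a literal character, not an anchor.
    return re.sub(r"[$,]", "", amount)


print(strip_nested("$8,122.14"), strip_regex("$8,122.14"))  # 8122.14 8122.14
```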

Possible combinations (variations) of words in a string variable in Stata

I have a string variable containing school names, and I need to find all the possible combinations of each word in this string variable in Stata.
For example, variations of the word "Academy" would be:
Academy,
Academy,
acdamey,
aacdemy,
dmcaamy,
aacedmy,
and so on.
I need this to standardize the raw data of school names, which has many typos of each word due to data entry issues, like the ones given above for "academy".
Depending on whether your data is already in Excel sheets or in a file, you can either use regex to try to match all possible combinations (and probably fix them when found) or parse the strings first, before bringing them into Excel. In either case you could make a file (or Excel list/table/area, etc.) that includes all the common typos and use each typo as a regex match when comparing against your actual input.
Writing a regexp that would actually find all possible cases is next to impossible, especially if there are cases where very similar (but correct) school names exist. In any case, direct regexps would be very messy and complex, so I would advise you to parse the data by first finding the correct form, excluding it, and then using a (greedy) search/regex to find the misspelled versions. You can then save the typos to use them as a filter/match/pattern.
To get some starting ideas, check these links:
Regex: Search for verb roots
Read text file and extract string into Excel sheet using regex
P.S. You should keep a count of all strings/school names and, at the end, get a list of all names that matched neither the correct form nor any of your regexp filters, so that you can insert or correct them manually.
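The typo-list workflow described above could look roughly like this in Python (not Stata or Excel). Everything here is hypothetical: the `KNOWN_TYPOS` table, the helper names, and the sample school names. In practice the typo list would be loaded from a maintained file.

```python
import re

# Hypothetical typo list: correct form -> known misspellings.
KNOWN_TYPOS = {"academy": ["acdamey", "aacdemy", "dmcaamy", "aacedmy"]}


def build_patterns(typo_map):
    """Compile one whole-word, case-insensitive alternation per correct form."""
    return {
        correct: re.compile(
            r"\b(?:" + "|".join(map(re.escape, typos)) + r")\b", re.IGNORECASE
        )
        for correct, typos in typo_map.items()
    }


def standardize(name, patterns):
    """Replace every known typo with its correct form (title-cased)."""
    for correct, pattern in patterns.items():
        name = pattern.sub(correct.title(), name)
    return name


def needs_review(name, patterns):
    """True if the name contains neither a correct form nor a known typo."""
    has_correct = any(
        re.search(r"\b" + re.escape(c) + r"\b", name, re.IGNORECASE) for c in patterns
    )
    has_typo = any(p.search(name) for p in patterns.values())
    return not (has_correct or has_typo)


patterns = build_patterns(KNOWN_TYPOS)
names = ["Hill Acdamey", "Lake Academy", "River Acadmy"]
print([standardize(n, patterns) for n in names])
print([n for n in names if needs_review(n, patterns)])  # left for manual correction
```

Names flagged by `needs_review` are the ones to fix by hand and, where appropriate, add to the typo list.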

Regular expression or replace function in the WHERE clause of a MySQL query

I wrote a MySQL query
select * from table where name like '%salil%'
which works fine, but it will not return records with a name like 'sal-il' or 'sa#lil'.
So I want a query something like the one below:
select * from table where remove_special_character_from(name) like '%salil%'
where remove_special_character_from(name) is a MySQL function or a regular expression that removes all special characters from name before the LIKE is executed.
No, MySQL doesn't support regexp-based replace (at least not before version 8.0, which added REGEXP_REPLACE).
I'd suggest using normalized versions of the search terms, stored in separate fields.
So, at insert time you strip all non-alpha characters from the data and store it in the data_norm field for the future searches.
Since I know no way to do this, I'd use a "calculated column" for this, i.e. a column which depends on the value of name but without the special characters. This way, the cost for the transformation is paid only once and you can even create an index on the new column.
See this answer for how to do this.
I agree with Aaron and Col. Shrapnel that you should use an extra column on the table e.g. search_name to store a normalised version of the name.
I noticed that this question was originally tagged ruby-on-rails. If this is part of a Rails application then you can use a before_save callback to set the value of this field.
In MySQL 5.1 you can use REGEXP to do regular expression matching like this:
SELECT * FROM foo WHERE bar REGEXP "baz"
See http://dev.mysql.com/doc/refman/5.1/en/regexp.html
However, take note that it will be slow, and you should do what the other posters suggested and store the clean value in a separate field.
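The normalised-column idea suggested in the answers above can be sketched in application code like this. It is written in Python purely for illustration. `normalize`, `search`, and the in-memory `rows` list stand in for whatever insert hook and table the real application uses. Following the "strip all non-alpha characters" suggestion, digits are dropped as well.

```python
import re


def normalize(name):
    """Lowercase the name and strip every non-alphabetic character."""
    return re.sub(r"[^a-z]", "", name.lower())


def search(rows, term):
    """Emulate: WHERE name_norm LIKE '%<normalized term>%'."""
    needle = normalize(term)
    return [name for name, name_norm in rows if needle in name_norm]


names = ["salil", "sal-il", "sa#lil", "sunil"]
# name_norm is computed once, at insert time, and stored next to name.
rows = [(n, normalize(n)) for n in names]
print(search(rows, "salil"))  # ['salil', 'sal-il', 'sa#lil']
```

Because the transformation runs once per insert, the stored column can also be indexed, as one of the answers notes.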