SQL Server regex and if else in where clause - regex

This is my code:
$db = new COM("ADODB.Connection");
$dsn = "DRIVER={SQL Server}; SERVER={$server}; UID={$usr}; PWD={$pwd}; DATABASE={$dbname}";
$db->Open($dsn);
$sql = "SELECT o.CardCode, o.CardName, o.VatIDNum, o.AddID, o.Cellular, o.E_Mail, c.Address
FROM ocrd o INNER JOIN crd1 c ON o.CardCode = c.CardCode
WHERE o.Cellular = '$phone1' o.CardName LIKE N'%$lname%'";
$result = $db->execute($sql);
In the databese the o.Cellular column includes phone numbers that could be formatted with dashes/spaces/+ sign/braces so when I am checking WHERE o.Cellular = '$phone1' I need to reformat o.Cellular to only digits (the $phone1 already only digits).
The second problem is that if o.Cellular not equals $phone1, I want to check o.CardName LIKE N'%$lname%'.
So the current part of code doesn't works as I need.
Any help please...

REGEX is extremely limited in SQL Server. Here's an example to replace the following characters in your phone number: + ( ) - \s
SELECT
o.CardCode,
o.CardName,
o.VatIDNum,
o.AddID,
o.Cellular,
o.E_Mail,
c.Address
FROM
ocrd o
INNER JOIN
crd1 c ON
o.CardCode = c.CardCode
WHERE
replace(replace(replace(replace(replace(o.Cellular,'-',''),' ',''),'(',''),')',''),'+','') = '$phone1'
or o.CardName LIKE N'%$lname%'
--test example you can run to see the results of the code
declare #Cellular varchar(16) = '+1 (156) 555-7899'
select replace(replace(replace(replace(replace(#Cellular,'-',''),' ',''),'(',''),')',''),'+','')
--returns 1156555789

My own final solution:
$phone1 = "0123456789"; /* tests 0123456789 */
$phone1 = preg_replace("/\D+/", "", $phone); /* Leave digits only */
$phone1_wildcard = "%".implode("%",str_split($phone1))."%"; /* converts 0123456789 to %0%1%2%3%4%5%6%7%8%9% in order to catch any mistyping in mssql */

Related

How extract (changeable variable) word & number using regular expression matlab

I have more than 10k text files look similar like this, all of them are similar in format but not in size, sometime is bigger or smaller.
[{u'language': u'english', u'area': 3825.8953168044045, u'class': u'machine printed', u'utf8_string': u'troia', u'image_id': 428035, u'box': [426.42422762784093, 225.33333055900806, 75.15151515151516, 50.909090909090864], u'legibility': u'legible', u'id': 1056659}, {u'language': u'na', u'area': 24201.285583103767, u'id': 1056660, u'image_id': 428035, u'box': [223.99998520359847, 249.57575480143228, 172.12121212121215, 140.6060606060606], u'legibility': u'illegible', u'class': u'machine printed'}]
I want to extract two changeable variable in every text using regular expression.
The output should be like this
box = [223.99998520359847, 249.57575480143228, 172.12121212121215, 140.6060606060606]
box1 = .. sometime there is more than one
&
second output
word = troia
word1 = ... sometime there is more than one word
My code 1: for the word extraction
fid = fopen('text1.txt','r');
C = textscan(fid, '%s','Delimiter','');
fclose(fid);
C = C{:};
Lia = ~cellfun(#isempty, strfind(C,'utf8_string'));
output = [C{find(Lia)}];
expression = 'u''utf8_string'': u+'
matchStr = regexp(output, expression,'match');
My code 1 result give me only the
utf8_string
My code 2: for the box number extraction
s = sprintf('text_.txt');
fid = fopen(s);
tline = fgetl(fid);
C = regexp(tline,'u''box'': +\[([0-9\. ,]+)\]','tokens');
C = cellfun(#(x) x{1},C,'UniformOutput',false)';
M = cell2mat(cellfun(#(x) x', cat(1,C2{:}),'UniformOutput',false));
This code 2 is running but not with every text something i got this error
Error using cat Dimensions of matrices being concatenated are not consistent
If you do not insist on regexp: The input strings looks like json, so the following short code does even more than you want:
% Read the whole file
s = fileread('test.txt');
% Remove the odd u'
s = strrep(s, 'u''', '''');
% Replace ' by "
s = strrep(s, '''', '"');
% See http://www.mathworks.com/matlabcentral/fileexchange/20565
t = parse_json(s);
Now t a is cell object containing structs with the data. So
word = t{1}.utf8_string;
box = cell2mat(t{1}.box);
will give you the first word and box. If you have a newer Matlab version you can probably use jsondecode instead of parse_json.

Search for multiple string occurrence in String via PHP

I am working on an ecommerce website using MVC,php. I have a field called description. The user can enter multiple product id's in the description field.
For example {productID = 34}, {productID = 58}
I am trying to get all product ID's from this field. Just the product ID.
How do i go about this?
This solution doesn't use a capture group. Rather, it uses \K so that the full string elements become what would otherwise be captured using parentheses. This is a good practice because it reduces the array element count by 50%.
$description="{productID = 34}, {productID = 58}";
if(preg_match_all('/productID = \K\d+/',$description,$ids)){
var_export($ids[0]);
}
// output: array(0=>'34',1=>'58')
// \K in the regex means: keep text from this point
Without using regex, something like this should work for returning the string positions:
<code>
$html = "dddasdfdddasdffff";
$needle = "asdf";
$lastPos = 0;
$positions = array();
while (($lastPos = strpos($html, $needle, $lastPos))!== false) {
$positions[] = $lastPos;
$lastPos = $lastPos + strlen($needle);
}
// Displays 3 and 10
foreach ($positions as $value) {
echo $value ."<br />";
}
</code>

RegEx not working in MATLAB

I have not done any RegEx work in MATLAB, I do not think this is an environment issue but I am not sure. Here is my task:
Download NASDQ stock data from ftp://ftp.nasdaqtrader.com/symboldirectory/nasdaqtraded.txt
Extract all stock symbols using a RegEx
Here is the RegEx that I created: ^[A-Z]\|([A-Z]+)\|.+\|[A-Z]\|[A-Z]\|[A-Z]\|\d\d\d\|[A-Z]\|[A-Z]\|.*\|[A-Z]+$
This expression works on some, but not all lines in this file. For example, it works perfectly for this line:
- Y|AAPL|Apple Inc. - Common Stock|Q|Q|N|100|N|N||AAPL
However it does not match anything from this line:
- Y|A|Agilent Technologies, Inc. Common Stock|N| |N|100|N||A|A
- Y|AAMC|Altisource Asset Management Corp Com|A| |N|100|N||AAMC|AAMC
Help please...thanks!
Your file seems to be a set of columns delimited with |, with first line being column names.
Here is a solution to create directly structure array whose field names are obtained from column names:
function [structArray] = ReadNasdaqTraded(filename)
%[
% For debug
if (nargin < 1), filename = 'nasdaqtraded.txt'; end
% Read full file content
text = fileread(filename);
% Split on newline
text = strsplit(strtrim(text), '\n');
header = text{1}; % Keep header
content = text(2:(end-1)); % Keep content
footer = text{end}; %#ok - We don't care about last line (file creation date)
% Build suitable field names
fieldNames = strsplit(header, '|');
fieldNames = strtrim(fieldNames); % Remove any
fieldNames = strrep(fieldNames, ' ', ''); % spaces (TODO: OR special characters)
% Reformat content into cell matrix
count = length(content);
columnCount = length(fieldNames);
cellArray = cell(count, columnCount);
for ri = 1:count,
cellArray(ri, :) = strsplit(content{ri}, '|', 'CollapseDelimiters', false); % Carefull not to collapse empty delimiters
end
% Create structure array from cell content
structArray = cell2struct(cellArray, fieldNames, 2);
%]
It returns some result like this:
>> ReadNasdaqTraded('nasdaqtraded.txt')
ans =
8188x1 struct array with fields:
NasdaqTraded
Symbol
SecurityName
ListingExchange
MarketCategory
ETF
RoundLotSize
TestIssue
FinancialStatus
CQSSymbol
NASDAQSymbol
Easy to use then for whatever extra processing you need ...

Regex: How to remove English words from sentences using Regex?

I've number of rows in SQLite, each row has one column that contains data like this:
prosperکامیاب شدن ، موفق شدن ، رونق یافتن
As you can see, the sentence starts with English words, Now I want to remove English words at first of each sentence. Is there any way to do that via T-SQL query(using Regex)?
you may try this :) I have made it as a function to call upon
create function dbo.RemoveEngChars (#Unicode_string nvarchar(max))
returns nvarchar(max) as
begin
declare #i int = 1; -- must start from 1, as SubString is 1-based
declare #OriginalString nvarchar(100) = #Unicode_string collate SQL_Latin1_General_Cp1256_CS_AS
declare #ModifiedString nvarchar(100) = N'';
while #i <= Len(#OriginalString)
begin
if SubString(#OriginalString, #i, 1) not like '[a-Z]'
begin
set #ModifiedString = #ModifiedString + SubString(#OriginalString, #i, 1);
end
set #i = #i + 1;
end
return #ModifiedString
end
--To call the function , you can run the following script and pass the Unicode in N' prefix
select dbo.RemoveEngChars(N'prosperکامیاب شدن ، موفق شدن ، رونق یافتن')

Retrieve particular parts of string from a text file and save it in a new file in MATLAB

I am trying to retrieve particular parts of a string in a text file such as below and i would like to save them in a text file in MATLAB
Original text file
D 1m8ea_ 1m8e A: d.174.1.1 74583 cl=53931,cf=56511,sf=56512,fa=56513,dm=56514,sp=56515,px=74583
D 1m8eb_ 1m8e B: d.174.1.1 74584 cl=53931,cf=56511,sf=56512,fa=56513,dm=56514,sp=56515,px=74584
D 3e7ia1 3e7i A:77-496 d.174.1.1 158052 cl=53931,cf=56511,sf=56512,fa=56513,dm=56514,sp=56515,px=158052
D 3e7ib1 3e7i B:77-496 d.174.1.1 158053 cl=53931,cf=56511,sf=56512,fa=56513,dm=56514,sp=56515,px=158053
D 2bhja1 2bhj A:77-497 d.174.1.1 128533 cl=53931,cf=56511,sf=56512,fa=56513,dm=56514,sp=56515,px=128533
So basically, I would like to retrieve the pdbcodes id which are labeled as "1m8e", chainid labeled as "A" the Start values which is "77" and stop values which is "496" and i would like all of these values to be saved inside of a fprintf statment.
Is there some kind of method is which i can use in RegExp stating which index its all starting at and retrieve those strings based on the position in the text file for each line?
In the end, all i want to have in the fprinf statement is 1m8e, A, 77, 496.
So far i have two fopen function which reads a file and one that writes to a new file and to read each line by line, also a fprintf statment:
pdbcode = '';
chainid = '';
start = '';
stop = '';
fin = fopen('dir.cla.scop.txt_1.75.txt', 'r');
fout = fopen('output_scop.txt', 'w');
% TODO: Add error check!
while true
line = fgetl(fin); % Get the next line from the file
if ~ischar(line)
% End of file
break;
end
% Print result into output_cath.txt file
fprintf(fout, 'INSERT INTO cath_domains (scop_pdbcode, scop_chainid, scopbegin, scopend) VALUES("%s", %s, %s, %s);\n', pdbcode, chainid, start, stop);
Thank you.
You should be able to strsplit on whitespace, get the third ("1m8e") and fourth elements ("A:77-496"), then repeat the process on the fourth element using ":" as the split character, and then again on the second of those two arguments using "-" as the split character. That's one approach. For example, you could do:
% split on space and tab, and ignore empty tokens
tokens = strsplit(line, ' \t', true);
pdbcode = tokens(3);
% split fourth token from previous split on colon
tokens = strsplit(tokens(4), ':');
chainid = tokens(1);
% split second token from previous split on dash
tokens = strsplit(tokens(2), '-');
start = tokens(1);
stop = tokens(2);
If you really wanted to use regular expressions, you could try the following
pattern = '\S+\s+\S+\s+(\S+)\s+([A-Za-z]+):([0-9]+)-([0-9]+)';
[mat tok] = regexp(line, pattern, 'match', 'tokens');
pdbcode = cell2mat(tok)(1);
chainid = cell2mat(tok)(2);
start = cell2mat(tok)(3);
stop = cell2mat(tok)(4);