How to deal with comma and apostrophe characters when inserting into SQLite - C++

I have a string that contains apostrophes and commas, and when I execute an INSERT into SQLite
it gives me an error. For example, with a string like this:
...., 'The Smiths - I Know It's Over', .....
"Over": syntax error Unable to execute statement
How can I keep the apostrophes in the string but still perform a valid insert?
I'm using Qt C++.

You shouldn't be putting arbitrary strings directly into SQL - that's asking for an injection attack. Instead, use bound parameters; something like:
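sqlite3_stmt *statement = nullptr;
// Compile the query once; the ? marks where a value will be bound.
sqlite3_prepare_v2(db, "select * from students where name=?", -1, &statement, nullptr);
// Bind the raw string to parameter 1; it is treated as data, never as SQL.
sqlite3_bind_text(statement, 1, "'; drop table students --", -1, SQLITE_STATIC);
sqlite3_step(statement);
sqlite3_finalize(statement);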
This will replace the first parameter (?) in the query with the given string, with no danger of any ' character being interpreted as the end of the string, or of any part of the string being executed as SQL.
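For anyone who wants to try the idea without a C++ toolchain, here is a minimal sketch of the same parameter binding using Python's built-in sqlite3 module. The table name tracks and the helper insert_title are made up for illustration. Qt's QSqlQuery offers the same mechanism through prepare() and bindValue(), so the approach should carry over.

```python
import sqlite3

def insert_title(conn, title):
    # The ? placeholder is filled in by the driver; the string is
    # passed as data and is never spliced into the SQL text.
    conn.execute("INSERT INTO tracks (title) VALUES (?)", (title,))

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE tracks (title TEXT)")  # hypothetical table

insert_title(conn, "The Smiths - I Know It's Over")
insert_title(conn, "'; drop table tracks --")  # stored verbatim, not executed

rows = [r[0] for r in conn.execute("SELECT title FROM tracks ORDER BY rowid")]
print(rows)
```

Both strings come back exactly as they were passed in, apostrophes and commas included.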

From the FAQ:
(14) How do I use a string literal that contains an embedded single-quote (') character?
The SQL standard specifies that single-quotes in strings are escaped by putting two single quotes in a row. SQL works like the Pascal programming language in this regard. SQLite follows this standard. Example:
INSERT INTO xyz VALUES('5 O''clock');
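If you really have to build the literal yourself, the doubling rule is a one-line replace. The sketch below (Python's built-in sqlite3 module; quote_sql_literal is a hypothetical helper, not part of any library) applies it and checks that the value round-trips. Bound parameters are usually the safer choice, so treat this only as an illustration of the FAQ rule.

```python
import sqlite3

def quote_sql_literal(text):
    # Double every single quote, then wrap the result in single quotes.
    return "'" + text.replace("'", "''") + "'"

literal = quote_sql_literal("5 O'clock")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (v TEXT)")
# The statement text is now: INSERT INTO xyz VALUES('5 O''clock')
conn.execute("INSERT INTO xyz VALUES(" + literal + ")")

stored = conn.execute("SELECT v FROM xyz").fetchone()[0]
print(literal, "->", stored)
```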

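I believe the MySQL library has a function named mysql_real_escape_string to escape strings, but it does not apply to SQLite. Note that SQLite does not treat a backslash as an escape character, so 'The Smiths - I Know It\'s Over' still fails; double the apostrophe instead, as in 'The Smiths - I Know It''s Over'. Double quotes are meant for identifiers in SQLite, so they are not a reliable way to delimit string values.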

Related

regular expression replace for SQL

I have to replace a string pattern in SQL with an empty string. Could anyone please suggest how?
Input String 'AC001,AD001,AE001,SA001,AE002,SD001'
Output String 'AE001,AE002'
Each code is five characters long: two letters followed by three digits. I have to remove all codes except those starting with "AE".
There can be zero or more "AE" codes in the string. The final output should be a comma-separated string of the "AE" codes, as shown above.
Here is one option calling regexp_replace multiple times, eliminating the unwanted strings step by step to arrive at the required output.
SELECT regexp_replace(
regexp_replace(
regexp_replace(
'AC001,AD001,AE001,SA001,AE002,SD001', '(?<!AE)\d{3},{0,1}', 'X','g'
),'..X','','g'
),',$','','g'
)
I would convert the list to an array, unnest that to rows then filter out those that should be kept and aggregate it back to a string:
select string_agg(x.t, ',')
from unnest(string_to_array('AC001,AD001,AE001,SA001,AE002,SD001', ',')) as x(t)
where x.t like 'AE%'; --<< only keep those
This is independent of the number of elements in the string and can easily be extended to support more complex conditions.
This is a good example why storing comma separated values in a single column is not such a good idea to begin with.
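The split, filter and re-join idea is not specific to Postgres. As a rough sketch of the same three steps in Python (keep_codes is a made-up name, and the prefix parameter is an assumption added for illustration):

```python
def keep_codes(csv_codes, prefix="AE"):
    # 1. split the list into elements (string_to_array + unnest)
    codes = csv_codes.split(",")
    # 2. keep only the wanted elements (the WHERE ... LIKE 'AE%' filter)
    kept = [code for code in codes if code.startswith(prefix)]
    # 3. glue them back together (string_agg)
    return ",".join(kept)

result = keep_codes("AC001,AD001,AE001,SA001,AE002,SD001")
print(result)
```

One difference to keep in mind: when nothing matches, this sketch returns an empty string, whereas string_agg over zero rows yields NULL.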

SAS search and replace the last word using regex [duplicate]

Match strings ending in certain character
I am trying to create a new variable which indicates if a string ends with a certain character.
Below is what I have tried, but when this code is run, the variable ending_in_e is all zeros. I would expect that names like "Alice" and "Jane" would be matched by the code below, but they are not:
proc sql;
select *,
case
when prxmatch("/e$/",name) then 1
else 0
end as ending_in_e
from sashelp.class
;quit;
You should account for the fact that, in SAS, character variables have a fixed length: if the actual value is shorter than the declared length, it is padded with trailing spaces.
Either trim the string:
prxmatch("/e$/",trim(name))
Or add a whitespace pattern:
prxmatch("/e\s*$/",name)
Here \s* matches zero or more whitespace characters before the end of the string.
SAS character variables are fixed length. So you either need to trim the trailing spaces or include them in your regular expression.
Regular expressions are powerful, but they might be confusing to some. For such a simple pattern it might be clearer to use simpler functions.
proc print data=sashelp.class ;
where char(name,length(name))='e';
run;
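The padding effect is easy to reproduce outside SAS. In the sketch below a Python string is padded by hand to 8 characters to imitate a fixed-length character variable (the width 8 is an arbitrary assumption), and the three approaches from the answers above are compared.

```python
import re

# Imitate a fixed-length SAS character variable: "Alice" plus three spaces.
padded = "Alice".ljust(8)

# The original pattern fails because the last character is a space, not "e".
plain = bool(re.search(r"e$", padded))

# Fix 1: trim the trailing spaces first, like trim(name).
trimmed = bool(re.search(r"e$", padded.rstrip()))

# Fix 2: allow optional whitespace before the end of the string.
tolerant = bool(re.search(r"e\s*$", padded))

# Fix 3: no regex at all, compare the last non-blank character.
last_char = padded.rstrip()[-1] == "e"

print(plain, trimmed, tolerant, last_char)
```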

Extract numbers from a field in PostgreSQL

I have a table with a column po_number of type varchar in Postgres 8.4. It stores alphanumeric values with some special characters. I want to ignore the characters [/alpha/?/$/encoding/.] and check whether the column contains a number. If it is a number, it needs to be cast to a number; otherwise NULL should be passed, as my output field po_number_new is a number field.
I tried this statement:
select
(case when regexp_replace(po_number,'[^\w],.-+\?/','') then po_number::numeric
else null
end) as po_number_new from test
But I got an error for explicit cast:
Simply:
SELECT NULLIF(regexp_replace(po_number, '\D','','g'), '')::numeric AS result
FROM tbl;
\D being the class shorthand for "not a digit".
And you need the 4th parameter 'g' (for "globally") to replace all occurrences.
Details in the manual.
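To make the two steps explicit (strip every non-digit, then turn an empty result into NULL before casting), here is a small sketch of the same logic in Python. The function name po_number_to_numeric is hypothetical, and [^0-9] is used in place of \D so that only ASCII digits are kept.

```python
import re
from decimal import Decimal

def po_number_to_numeric(po_number):
    if po_number is None:
        return None  # NULL in, NULL out
    # Remove every character that is not a digit (all occurrences,
    # which is what the 'g' flag does in regexp_replace).
    digits = re.sub(r"[^0-9]", "", po_number)
    # NULLIF(..., ''): an empty string becomes None instead of failing the cast.
    return Decimal(digits) if digits else None

print(po_number_to_numeric("PO-12/34"))
print(po_number_to_numeric("n/a"))
```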
For a known, limited set of characters to replace, plain string manipulation functions like replace() or translate() are substantially cheaper. Regular expressions are just more versatile, and we want to eliminate everything but digits in this case. Related:
Regex remove all occurrences of multiple characters in a string
PostgreSQL SELECT only alpha characters on a row
Is there a regexp_replace equivalent for postgresql 7.4?
But why Postgres 8.4? Consider upgrading to a modern version.
Consider pitfalls for outdated versions:
Order varchar string as numeric
WARNING: nonstandard use of escape in a string literal
I think you want something like this:
select (case when regexp_replace(po_number, '[^\w],.-+\?/', '') ~ '^[0-9]+$'
then regexp_replace(po_number, '[^\w],.-+\?/', '')::numeric
end) as po_number_new
from test;
That is, you need to do the conversion on the string after replacement.
Note: This assumes that the "number" is just a string of digits.
The logic I would use to determine if the po_number field contains numeric digits is that its length should decrease when attempting to remove numeric digits.
If so, then all non numeric digits ([^\d]) should be removed from the po_number column. Otherwise, NULL should be returned.
select case when char_length(regexp_replace(po_number, '\d', '', 'g')) < char_length(po_number)
then regexp_replace(po_number, '[^0-9]', '', 'g')
else null
end as po_number_new
from test
If you want to extract floating numbers try to use this:
SELECT NULLIF(regexp_replace(po_number, '[^\.\d]','','g'), '')::numeric AS result FROM tbl;
It's the same as Erwin Brandstetter's answer but with a different expression:
[^...] - match any character except those listed; put the excluded characters in place of ...
\. - point character (also you can change it to , char)
\d - digit character
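One caveat with keeping the point character: input such as 'v1.2.3' leaves '1.2.3', which is not a valid number, so the cast would raise an error. The sketch below shows the same character filter in Python; returning None for such leftovers is a choice made for this sketch, not what the SQL above does, and extract_decimal is a made-up name.

```python
import re
from decimal import Decimal, InvalidOperation

def extract_decimal(text):
    # Keep only digits and the point character, like '[^\.\d]' above.
    kept = re.sub(r"[^.0-9]", "", text)
    if kept == "":
        return None  # nothing numeric at all (the NULLIF case)
    try:
        return Decimal(kept)
    except InvalidOperation:
        # Leftovers such as "1.2.3" or "." are not valid numbers.
        return None

print(extract_decimal("USD 12.50"))
print(extract_decimal("v1.2.3"))
```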
Since version 12 (released 2 years and 4 months before the time of writing, but after the last edit that I can see on the accepted answer), you can use a generated column to do this once, when the row is written, rather than having to calculate it each time you SELECT a new po_number.
Furthermore, you can use the TRANSLATE function to extract your digits, which is less expensive than the REGEXP_REPLACE solution proposed by @ErwinBrandstetter!
I would do this as follows:
CREATE TABLE s
(
num TEXT,
new_num INTEGER GENERATED ALWAYS AS
(NULLIF(TRANSLATE(num, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ. ', ''), '')::INTEGER) STORED
);
You can add to the 'ABCDEFG... string in the TRANSLATE function as appropriate - I have decimal point (.) and a space ( ) at the end - you may wish to have more characters there depending on your input!
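For readers unfamiliar with TRANSLATE, Python's str.translate behaves much the same when given a set of characters to delete. This is only a sketch of the expression used in the generated column; to_int_or_none is a hypothetical name, and the character list is copied from the SQL above.

```python
# Characters to delete, copied from the TRANSLATE call above.
STRIP_TABLE = str.maketrans("", "", "ABCDEFGHIJKLMNOPQRSTUVWXYZ. ")

def to_int_or_none(num):
    if num is None:
        return None
    remaining = num.translate(STRIP_TABLE)  # delete the listed characters
    # NULLIF(..., ''): empty leftovers become None rather than failing.
    # Any other leftover character (e.g. lowercase letters) makes int()
    # raise, just as the ::INTEGER cast would raise in SQL.
    return int(remaining) if remaining else None

print([to_int_or_none(v) for v in ("2", "", None, " ", "AB12.3")])
```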
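And checking (table t is assumed to be defined like s, but with the REGEXP_REPLACE expression; its definition is not shown here):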
INSERT INTO s VALUES ('2'), (''), (NULL), (' ');
INSERT INTO t VALUES ('2'), (''), (NULL), (' ');
SELECT * FROM s;
SELECT * FROM t;
Result (same for both):
num   | new_num
'2'   | 2
''    | NULL
NULL  | NULL
' '   | NULL
I wanted to check how efficient my solution was, so I ran the following test, inserting 10,000 records into each of the tables s and t:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
INSERT INTO t
with symbols(characters) as
(
VALUES ('ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
)
select string_agg(substr(characters, (random() * length(characters) + 1) :: INTEGER, 1), '')
from symbols
join generate_series(1,10) as word(chr_idx) on 1 = 1 -- word length
join generate_series(1,10000) as words(idx) on 1 = 1 -- # of words
group by idx;
The differences weren't that huge but the regex solution was consistently slower by about 25% - even changing the order of the tables undergoing the INSERTs.
However, where the TRANSLATE solution really shines is when doing a "raw" SELECT as follows:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT
NULLIF(TRANSLATE(num, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ. ', ''), '')::INTEGER
FROM s;
and the same for the REGEXP_REPLACE solution.
The differences were very marked, the TRANSLATE taking approx. 25% of the time of the other function. Finally, in the interests of fairness, I also did this for both tables:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT
num, new_num
FROM t;
Both extremely quick and identical!

finding if a string contains number using regexp_replace

I want to find if a string is a pure string using regexp_replace only.
If it is not a pure string, then designate the string as XX;
Else, use the string.
For example:
If a string is 'A1', then since there is a number, it is not a pure string and the output would be XX.
If the string is AB, there is no number or anything other than an alpha, so use AB.
Note: because of a weird requirement, this needs to be done with regexp_replace only.
I know how this is done using regexp_like or translate etc., but I would like to do it with regexp_replace only.
I don't know about Oracle regexp_replace method but a 'generic' regular expression would be:
regex = ^[a-zA-Z]+$ // Only alphabetical characters, upper case or lower case
Thanks to all. With some R&D, I was able to find the solution which works as well.
select case when length('A1') = length(trim(regexp_replace('A1','[[:digit:]]',' '))) then 'string'
else 'alphanumeric'
end from dual
create table regexp_replace_example(test_string varchar2(100) not null);
insert into regexp_replace_example(test_string) values ('A1');
insert into regexp_replace_example(test_string) values ('AB');
select test_string, case
when regexp_replace(test_string,'[^a-zA-Z]') != test_string then 'XX'
else test_string
end as output_value
from regexp_replace_example;
drop table regexp_replace_example;
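The trick in the last query is to compare the string with a copy that has every non-letter removed: if the two differ, something other than a letter was present. A minimal sketch of that comparison in Python (pure_or_xx is a made-up name):

```python
import re

def pure_or_xx(text):
    # Remove everything that is not a letter, like
    # regexp_replace(test_string, '[^a-zA-Z]') in the query above.
    letters_only = re.sub(r"[^a-zA-Z]", "", text)
    # If nothing was removed, the string was purely alphabetic.
    return text if letters_only == text else "XX"

print(pure_or_xx("A1"), pure_or_xx("AB"))
```

Edge cases such as an empty input behave differently in Oracle, where an empty string is NULL, so this sketch should not be read as an exact equivalent.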

Special Chars to ASCII

Using VBA in MS Access, is there a way to have all special chars in a string replaced with the ASCII equivalent? In other words, I want the ampersands gone and replaced with &amp;, along with every other special character.
A PHP equivalent is HTMLSpecialChars. I have semi-colons in my inserts that are probably blowing up my query. I need semi-colons converted to clean my text for an insert.
Starting with Access 2000 the Replace Function is available in Access VBA.
? Replace("a&b", "&", "&amp;")
a&amp;b
You would need to repeat that function pattern for any other characters you want to replace.
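As a point of comparison with the PHP function mentioned in the question, this is roughly what such a conversion looks like in Python, where the standard library's html.escape handles the common special characters in one call. It is shown only to illustrate the kind of output to expect; it is not VBA and does nothing to make an INSERT safer.

```python
import html

# html.escape replaces &, < and > (and, by default, both quote characters)
# with their HTML entity equivalents.
escaped_amp = html.escape("a&b")
escaped_tag = html.escape("<b>bold</b>")

print(escaped_amp)
print(escaped_tag)
```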
However, if this is intended to prevent blowing up an INSERT statement, it may be a red herring. You should be able to insert text which contains semi-colons or ampersands into a text field as long as the text you insert is properly quoted or is supplied as a parameter to a parameter query. Both these statements execute successfully for me.
CurrentDb.Execute "INSERT INTO MyTable (MyText) " & _
"VALUES ('a&b')"
CurrentDb.Execute "INSERT INTO MyTable (MyText) " & _
    "VALUES ('a;b')"
It may help to show us the SQL for your failing INSERT statement with a simple example of the text which causes it to blow up. Also tell us the error message, if any. Please paste the SQL into your question rather than into a comment.
http://www.renownedmedia.com/blog/convert-ascii-to-utf-8-using-vba/