Django's raw SQL feature adds single quote marks around any parameters you pass into the SQL query, if the parameters are strings.
This breaks for me when I need to do a query like this:
SELECT * FROM table WHERE id IN (%s)
The param is a string, such as '1,2,3', so Django renders the query as:
SELECT * FROM table WHERE id IN ('1,2,3')
Those quotes around the param break the query.
It seems to me Django is forcing me to use string interpolation (ie injecting the params into the string before it's used in the raw query), yet the docs clearly state we should not do that.
From the docs:
Using the params argument completely protects you from SQL injection
attacks, a common exploit where attackers inject arbitrary SQL into
your database. If you use string interpolation, sooner or later you’ll
fall victim to SQL injection. As long as you remember to always use
the params argument you’ll be protected.
Is there a way to turn OFF the quotes? If we want quotes we can just add them, no? (eg WHERE name = '%s')
The Django raw query takes in parameters as a single list or dictionary. So in this case, you should be invoking your raw query like this:
YourModel.objects.raw('SELECT * FROM table WHERE id in %s', [(1, 2, 3, 4, 5, ...)])
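The tuple form above relies on the database driver turning a Python tuple into a SQL list, which psycopg2 does but other backends may not. A more portable sketch is to generate one placeholder per value and still send the values through params. The helper name in_placeholders is hypothetical, and the runnable part uses the standard-library sqlite3 module (placeholder marker ?) as a stand-in for a Django connection:

```python
import sqlite3


def in_placeholders(values, marker="%s"):
    """Return a comma-separated placeholder list, one marker per value."""
    values = list(values)
    if not values:
        raise ValueError("IN () with no values is not valid SQL")
    return ", ".join([marker] * len(values))


ids = [1, 2, 3]

# Django style: only the placeholders are formatted into the SQL text;
# the values themselves still travel through the params argument.
django_sql = "SELECT * FROM table WHERE id IN (%s)" % in_placeholders(ids)
# YourModel.objects.raw(django_sql, ids)  # YourModel as in the answer above

# Runnable stand-in using sqlite3, whose placeholder marker is "?".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
)
sqlite_sql = "SELECT id FROM item WHERE id IN (%s) ORDER BY id" % in_placeholders(ids, "?")
rows = [r[0] for r in conn.execute(sqlite_sql, ids)]
```

Only the placeholder markers are interpolated into the query string, never user input, so the protection described in the docs still applies.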
I'm thinking of using a raw query to quickly get around limitations with either my brain or the Django ORM, but I don't want to redevelop the infrastructure required to support the existing ORM code such as filters. Right now I'm stuck with two dead ends:
Writing an inner raw query and reusing that like any other query set. Even though my raw query selects the correct columns, I can't filter on it:
AttributeError: 'RawQuerySet' object has no attribute 'filter'
This is corroborated by another answer, but I'm still hoping that that information is out of date.
Getting the SQL and parameters from the query set and wrapping that in a raw query. It seems the raw SQL should be retrievable using queryset.query.get_compiler(DEFAULT_DB_ALIAS).as_sql() - how would I get the parameters as well (obviously without actually running the query)?
One option for dealing with complex queries is to write a VIEW that encapsulates the query, and then stick a model in front of that. You will still be able to filter (and depending upon your view, you may even get push-down of parameters to improve query performance).
All you need to do to get a model that is backed by a view is have it as "unmanaged", and then have the view created by a migration operation.
It's better to try to write a QuerySet if you can, but at times it is not possible (because you are using something that cannot be expressed using the ORM, for instance, or you need to do something like a LATERAL JOIN).
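The reason a view helps is that the database treats it like a table, so ordinary WHERE clauses (and therefore ORM filters on an unmanaged model) apply to it. Here is a minimal sketch of that idea using the standard-library sqlite3 module rather than Django. The table and view names are made up for illustration. In Django, the CREATE VIEW statement would go in a migration and the model would set managed = False in its Meta.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
    INSERT INTO orders (customer, amount) VALUES
        ('alice', 10), ('alice', 30), ('bob', 5);

    -- The complex query is written once, inside the view.
    CREATE VIEW customer_totals AS
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer;
    """
)

# The view can now be filtered like any table, with bound parameters.
big_spenders = conn.execute(
    "SELECT customer, total FROM customer_totals WHERE total > ? ORDER BY customer",
    (20,),
).fetchall()
```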
Using AWS Cloudsearch, I need to query 2 separate fields for the same value using a structured (compound) query e.g.
(and (or name:'john smith') (or curr_addr:'123 someplace' other_addr:'123 someplace'))
This query works, but I'm wondering if it's necessary to repeat the value for each field that I want to search against. Is there some way to specify the value only once e.g. curr_addr+other_addr:'123 someplace'
That is the correct way to structure your compound query. From the AWS documentation, you'll see that they structure their example query the same way:
(and title:'star' (or actors:'Harrison Ford' actors:'William Shatner')(not actors:'Zachary Quinto'))
From Constructing Compound Queries
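If the repetition is mainly a maintenance concern, the query string can be generated so that the value is written only once in your own code. A minimal sketch, assuming the query is built client-side in Python; the helper names quote_term and any_field are hypothetical and not part of any AWS SDK:

```python
def quote_term(value):
    """Quote a string for a CloudSearch structured query.

    Backslashes and single quotes inside the value are escaped with a backslash.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def any_field(fields, value):
    """Build (or f1:'v' f2:'v' ...) so the value appears once in calling code."""
    term = quote_term(value)
    return "(or " + " ".join(f"{field}:{term}" for field in fields) + ")"


query = "(and {} {})".format(
    any_field(["name"], "john smith"),
    any_field(["curr_addr", "other_addr"], "123 someplace"),
)
```

The resulting string is the same compound query as in the question; the service still receives the value once per field.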
You may be able to get around this by listing the more repetitive fields in the query options (q.options), and then specify the field for the rest of the fields. The fields list is sort of a fallback for when you don't specify which field you are searching in a compound query. So if you list the address fields there, and then only specify the name field in your query, you may get close to the behavior you're looking for.
Query options
q.options={fields: ['curr_addr','other_addr']}
Query
(and (or name:'john smith') (or '123 someplace'))
But this approach would only work for one set of repetitive fields, so it's not a silver bullet by any means.
From Search API Reference (see q.options => fields)
To protect against sql injection, I read in the introduction to ColdFusion that we are to use the cfqueryparam tag.
But when using stored procedures, I am passing my variables to corresponding variable declarations in SQL Server:
DROP PROC Usr.[Save]
GO
CREATE PROC Usr.[Save]
(@UsrID Int
,@UsrName varchar(max)
) AS
UPDATE Usr
SET UsrName = @UsrName
WHERE UsrID=@UsrID
exec Usr.[get] @UsrID
Q: Is there any value in including cfSqlType when I call a stored procedure?
Here's how I'm currently doing it in Lucee:
storedproc procedure='Usr.[Save]' {
procparam value=Val(form.UsrID);
procparam value=form.UsrName;
procresult name='Usr';
}
This question came up indirectly on another thread. That thread was about query parameters, but the same issues apply to procedures. To summarize, yes you should always type query and proc parameters. Paraphrasing the other answer:
Since cfsqltype is optional, its importance is often underestimated:
Validation:
ColdFusion uses the selected cfsqltype (date, number, etcetera) to validate the "value". This occurs before any sql is ever sent to
the database. So if the "value" is invalid, like "ABC" for type
cf_sql_integer, you do not waste a database call on sql that was never
going to work anyway. When you omit the cfsqltype, everything is
submitted as a string and you lose the extra validation.
Accuracy:
Using an incorrect type may cause CF to submit the wrong value to the database. Selecting the proper cfsqltype ensures you are
sending the correct value - and - sending it in a non-ambiguous format
the database will interpret the way you expect.
Again, technically you can omit the cfsqltype. However, that
means CF will send everything to the database as a string.
Consequently, the database will perform implicit conversion
(usually undesirable). With implicit conversion, the interpretation
of the strings is left entirely up to the database - and it might
not always come up with the answer you would expect.
Submitting dates as strings, rather than date objects, is a
prime example. How will your database interpret a date string like
"05/04/2014"? As April 5th or May 4th? Well, it depends. Change the
database or the database settings and the result may be completely
different.
The only way to ensure consistent results is to specify the
appropriate cfsqltype. It should match the data type of the target
column/function (or at least an equivalent type).
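The date ambiguity is easy to reproduce outside ColdFusion. The sketch below uses Python purely as an illustration (it is not CFML and says nothing about how a particular database parses dates): the same text yields two different dates depending on the assumed convention, while a typed date value has only one meaning.

```python
from datetime import date, datetime

raw = "05/04/2014"

# The same text read under two common conventions.
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()  # May 4th
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()    # April 5th

# A typed value carries no such ambiguity.
typed = date(2014, 5, 4)
```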
We are developing a module in Tryton based on GNU Health. We got the following error:
ProgrammingError: operator does not exist: character varying = bigint
Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.
As best as I can vaguely guess from the limited information provided, in this query:
"SELECT name,age,dob,address FROM TABLENAME WHERE pmrn=%s" % (self.pmrn)
you appear to be doing a string substitution of a value into a query.
First, this is dangerously wrong, and you should never ever do it without an extremely good reason. Always use parameterized queries. psycopg2 supports these, so there's no excuse not to. So do all the other Python interfaces for PostgreSQL, but I'm assuming you're using psycopg2 because basically everyone does, so go read the usage documentation to see how to pass query parameters.
Second, as a result of failing to use parameterized queries, you aren't getting any help from the database driver with datatype handling. You mentioned that pmrn is of type char - for which I assume you really meant varchar; if it's actually char then the database designers need to be taken aside for a firm talking-to. Anyway, if you substitute an unquoted number in there your query is going to look like:
pmrn = 201401270001
and if pmrn is varchar that'll be an error, because you can't compare a text type to a number directly. You must pass the value as text. The simplistic way is to put quotes around it:
pmrn = '201401270001'
but what you should be doing instead is letting psycopg2 take care of all this for you by using parameterized queries. E.g.
curs.execute("SELECT name,age,dob,address FROM TABLENAME WHERE pmrn=%s", (self.pmrn,))
i.e. pass the SQL query as a string, then a 1-tuple containing the query params. (You might have to convert self.pmrn to str if it's an int, too, eg str(self.pmrn)).
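To make the difference concrete, here is a small runnable sketch. It uses the standard-library sqlite3 module as a stand-in because psycopg2 needs a running PostgreSQL server; the table name patient is hypothetical, and sqlite3 uses ? as its placeholder where psycopg2 uses %s. SQLite is lenient about types, so it will not reproduce the PostgreSQL error. The point is only the shape of the two calls.

```python
import sqlite3

pmrn = 201401270001  # an int, as in the failing query

# String interpolation: the number is pasted into the SQL text unquoted.
interpolated = "SELECT name FROM patient WHERE pmrn=%s" % (pmrn,)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (pmrn VARCHAR(20), name TEXT)")
conn.execute("INSERT INTO patient VALUES (?, ?)", ("201401270001", "Jane Doe"))

# Parameterized: SQL text and values are passed separately, as a 1-tuple,
# and the value is converted to text to match the column type.
row = conn.execute(
    "SELECT name FROM patient WHERE pmrn = ?", (str(pmrn),)
).fetchone()
```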
I'm wondering if the # symbol is enough.
This is a part of the sql command that I'm using
WHERE login='#FORM.login#' AND password COLLATE Latin1_General_CS_AS = '#FORM.password#'
I'm trying to test it with user names such as ' OR 1=1 and variants of it, but even though it's not working I don't want to have a false sense of security.
I've read that using <cfqueryparam> can prevent this form of attack, are there any other ways?
The way to go is <cfqueryparam>. It's simple, straightforward, datatype-safe, can handle lists (for use with IN (...)) and can handle conditional NULLs. Plus you get a benefit out of it in loops: the query text itself is sent to the server only once, and on each further loop iteration only the parameter values are transferred.
You can use '#var#' and be relatively safe. In the context of a <cfquery> tag ColdFusion will expand the value of var with single quotes escaped, so there is some automatic defense against SQL injection. But beware: by design, this does not happen with function return values. For example, in '#Trim(var)#' single quotes won't be escaped. This is easily overlooked and therefore dangerous.
Also, it has a disadvantage when run in a loop: Since variable interpolation happens before the SQL is sent to the server, ColdFusion will generate a new query text with every iteration of a loop. This means more bytes over the wire and no query plan caching on the server, as every query text is different.
In short: Use <cfqueryparam> wherever you can:
WHERE
login = <cfqueryparam value="#FORM.login#" cfsqltype="CF_SQL_VARCHAR">
AND password = <cfqueryparam value='#Hash(FORM.password, "SHA-512")#' cfsqltype="CF_SQL_VARCHAR">
Instead of a simple Hash(), you should indeed use a salted hash, as @SLaks pointed out in his comment.
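As a rough illustration of what "salted hash" means, here is a minimal sketch in Python (not CFML), using only the standard library. The function names and the iteration count are assumptions for the example, not a recommendation taken from this thread:

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA512 with a per-user salt."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each new password
    digest = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Store both the salt and the digest alongside the user record.
salt, stored = hash_password("correct horse")
```

At login, the stored salt is used to recompute the digest. Because each user gets a different salt, identical passwords do not produce identical stored values.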
An even better way to go would be to use stored procedures for everything.