How I can encode/escape a varchar to be more secure without using cfqueryparam?


I want to implement the same behaviour without using <cfqueryparam>, to get around the "Too many parameters were provided in this RPC request. The maximum is 2100" problem. See: http://www.bennadel.com/blog/1112-Incoming-Tabular-Data-Stream-Remote-Procedure-Call-Is-Incorrect.htm
Update:
I want the validation / security part, without generating a prepared-statement.
What's the strongest encode/escape I can do to a varchar inside <cfquery>?
Something similar to mysql_real_escape_string() maybe?

As others have said, that length-related error originates at a deeper level, not within the queryparam tag. And it offers some valuable protection and therefore exists for a reason.
You could always either insert those values into a temporary table and join against that one or use the list functions to split that huge list into several smaller lists which are then used separately.
SELECT name ,
..... ,
createDate
FROM somewhere
WHERE (someColumn IN (a,b,c,d,e)
OR someColumn IN (f,g,h,i,j)
OR someColumn IN (.........));
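
The list-splitting idea can be sketched outside ColdFusion as well. The following is a minimal Python illustration using the standard sqlite3 module as a stand-in database; the table layout, the helper names chunked and select_in_chunks, and the chunk size of 500 are assumptions made for the example, not part of the original answer. Each chunk still uses bound parameters, so no single statement comes near the 2100 limit.

```python
import sqlite3

def chunked(values, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]

def select_in_chunks(conn, ids, chunk_size=500):
    """Run one parameterised IN query per chunk and merge the rows."""
    rows = []
    for chunk in chunked(ids, chunk_size):
        # One "?" placeholder per value in this chunk.
        placeholders = ",".join("?" for _ in chunk)
        sql = f"SELECT id, name FROM somewhere WHERE id IN ({placeholders})"
        rows.extend(conn.execute(sql, chunk).fetchall())
    return rows

# Demo data: hypothetical table with 5000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE somewhere (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO somewhere VALUES (?, ?)",
    [(i, f"row{i}") for i in range(5000)],
)

wanted = list(range(0, 5000, 2))  # 2500 values, more than 2100
result = select_in_chunks(conn, wanted)
```

The trade-off is several round trips instead of one; the temporary table approach avoids that at the cost of extra setup.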

cfqueryparam performs multiple functions.
It verifies the datatype. If you say integer, it makes sure the value is an integer, and if not, it does not allow it to pass.
It separates the data of a SQL script from the executable code (this is where you get protection from SQL injection). Anything passed as a param cannot be executed.
It creates bind variables at the DB engine level to help improve performance.
That is how I understand cfqueryparam to work. Did you look into the option of making several small calls vs one large one?
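
To illustrate the second point (data kept separate from executable SQL), here is a small Python sketch using sqlite3 rather than ColdFusion. The users table and the hostile input are made up for the demonstration; the behaviour of bound parameters is the same idea that cfqueryparam relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (login TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

hostile = "nobody' OR '1'='1"

# String interpolation: the input becomes part of the SQL text,
# so the OR clause is executed and every row matches.
unsafe_sql = f"SELECT login FROM users WHERE login = '{hostile}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Bound parameter: the input is only ever compared as a value,
# so no login equals the hostile string and nothing is returned.
safe_rows = conn.execute(
    "SELECT login FROM users WHERE login = ?", (hostile,)
).fetchall()
```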

It is a security issue: cfqueryparam stops SQL injection.
Adobe recommends that you use the cfqueryparam tag within every cfquery tag, to help secure your databases from unauthorized users. For more information, see Security Bulletin ASB99-04, "Multiple SQL Statements in Dynamic Queries," at www.adobe.com/devnet/security/security_zone/asb99-04.html, and "Accessing and Retrieving Data" in the ColdFusion Developer's Guide.

The first thing I'd be asking myself is "how the heck did I end up with more than 2100 params in a single query?". Because that in itself should be a very very big red flag to you.
However if you're stuck with that (either due to it being outwith your control, or outwith your motivation levels to address ;-), then I'd consider:
the temporary table idea mentioned earlier
for values over a certain length just chop 'em in half and join 'em back together with a string concatenator, eg:
SELECT *
FROM tbl
WHERE col IN ('a', ';DROP DATABAS'+'E all_my_data', 'good', 'etc' [...])
That's a bit grim, but then again your entire query sounds grim, so that might not be such a concern.
param values that are over a certain length or have stop words in them or something. This is also quite a grim suggestion.
SERIOUSLY go back over your requirement and see if there's a way to not need 2100+ params. What is it you're actually needing to do that requires all this???

The problem does not lie with cfqueryparam, but with MS SQL Server itself:
Every SQL batch has to fit in the Batch Size Limit: 65,536 * Network Packet Size.
Maximum size for a SQL Server Query? IN clause? Is there a Better Approach
And
http://msdn.microsoft.com/en-us/library/ms143432.aspx

The few times that I have come across this problem I have been able to rewrite the query using subselects and/or table joins. I suggest trying to rewrite the query like this in order to avoid the parameter max.
If it is impossible to rewrite (e.g. all of the multiple parameters are coming from an external source) you will need to validate the data yourself. I have used the following regex in order to perform a safe validation:
<cfif ReFindNoCase("[^a-z0-9_\ \,\.]",arguments.InputText) IS NOT 0>
<cfthrow type="Application" message="Invalid characters detected">
</cfif>
The code will force an error if any special character other than a comma, underscore, or period is found in a text string. (You may want to handle the situation cleaner than just throwing an error.) I suggest you modify this as necessary based on the expected or allowed values in the fields you are validating. If you are validating a string of comma separated integers you may switch to use a more limiting regex like "[^0-9\ \,]" which will only allow numbers, commas, and spaces.
This answer will not escape the characters, it will not allow them in the first place. It should be used on any data that you will not use with <cfqueryparam>. Personally, I have only found a need for this when I use a dynamic sort field; not all databases will allow you to use bind variables with the ORDER BY clause.
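
The same allow-list check can be sketched in Python for readers who want to try the patterns outside ColdFusion. The function names require_safe_text and require_int_list are hypothetical; the character classes mirror the two regexes quoted above.

```python
import re

# Anything outside letters, digits, underscore, space, comma and period.
_DISALLOWED_TEXT = re.compile(r"[^a-z0-9_ ,.]", re.IGNORECASE)
# Stricter variant for comma separated integers.
_DISALLOWED_INTS = re.compile(r"[^0-9 ,]")

def require_safe_text(text):
    """Return text unchanged, or raise if it contains a disallowed character."""
    if _DISALLOWED_TEXT.search(text):
        raise ValueError("Invalid characters detected")
    return text

def require_int_list(text):
    """Return text unchanged if it holds only digits, commas and spaces."""
    if _DISALLOWED_INTS.search(text):
        raise ValueError("Invalid characters detected")
    return text
```

For a dynamic ORDER BY, comparing the input against a fixed list of known column names is usually an even tighter check than a character allow-list.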

Related

Quicksight breaking up strings for use of all aspects

I was wondering if anyone has ever had experience with breaking a string up in QuickSight and using certain parts of the string. My example is a data set that returns tags like this: "animals|funny|dog-park". I have used "split(tags,'|',1)" but then all that gets returned is the first part (animals). I have also tried a combination of ifelse->locate->split with no luck. Is there a way to split these tags so that they are all usable, e.g. (animals) & (funny) or (funny) & (dog-park), etc.? Say the associated article will then be listed under one tag but also separately under another? I know this will most likely end up being a calculated field. Thank you in advance!

Since QuickSight does not support any form of nested fields (including objects and list) in analysis, you need to normalise this into separate rows before feeding the data to QuickSight.
Otherwise, if you leave it as is, you would be limited to filtering using string contains and doing string lookup in calculated fields - nevertheless you would not be able to use these tags as categories (such as in colours field well of visuals).

With stored procedures, is cfSqlType necessary?

To protect against sql injection, I read in the introduction to ColdFusion that we are to use the cfqueryparam tag.
But when using stored procedures, I am passing my variables to corresponding variable declarations in SQL Server:
DROP PROC Usr.[Save]
GO
CREATE PROC Usr.[Save]
(@UsrID Int
,@UsrName varchar(max)
) AS
UPDATE Usr
SET UsrName = @UsrName
WHERE UsrID=@UsrID
exec Usr.[get] @UsrID
Q: Is there any value in including cfSqlType when I call a stored procedure?
Here's how I'm currently doing it in Lucee:
storedproc procedure='Usr.[Save]' {
procparam value=Val(form.UsrID);
procparam value=form.UsrName;
procresult name='Usr';
}

This question came up indirectly on another thread. That thread was about query parameters, but the same issues apply to procedures. To summarize, yes you should always type query and proc parameters. Paraphrasing the other answer:
Since cfsqltype is optional, its importance is often underestimated:
Validation:
ColdFusion uses the selected cfsqltype (date, number, etcetera) to validate the "value". This occurs before any sql is ever sent to
the database. So if the "value" is invalid, like "ABC" for type
cf_sql_integer, you do not waste a database call on sql that was never
going to work anyway. When you omit the cfsqltype, everything is
submitted as a string and you lose the extra validation.
Accuracy:
Using an incorrect type may cause CF to submit the wrong value to the database. Selecting the proper cfsqltype ensures you are
sending the correct value - and - sending it in a non-ambiguous format
the database will interpret the way you expect.
Again, technically you can omit the cfsqltype. However, that
means CF will send everything to the database as a string.
Consequently, the database will perform implicit conversion
(usually undesirable). With implicit conversion, the interpretation
of the strings is left entirely up to the database - and it might
not always come up with the answer you would expect.
Submitting dates as strings, rather than date objects, is a
prime example. How will your database interpret a date string like
"05/04/2014"? As April 5th or a May 4th? Well, it depends. Change the
database or the database settings and the result may be completely
different.
The only way to ensure consistent results is to specify the
appropriate cfsqltype. It should match the data type of the target
column/function (or at least an equivalent type).
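
The date ambiguity is easy to reproduce. This short Python sketch (not ColdFusion, and not tied to any particular database) parses the same string under two common conventions; which one a database picks for an untyped string depends on its settings.

```python
from datetime import datetime

raw = "05/04/2014"

# Month-first reading (common US convention): 4 May 2014.
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()

# Day-first reading (common European convention): 5 April 2014.
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()
```

Passing a typed date value instead of the raw string removes the guesswork, which is what specifying the cfsqltype achieves.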

Is it bad practice to use cfquery inside cfloop?

I have an array of structures. I need to insert all the rows from that array into a table.
So I have simply used cfquery inside cfloop to insert into the database.
Some people advised me not to use cfquery inside cfloop, as each iteration will make a new connection to the database.
But in my case, is there any way I can do this without using cfquery inside cfloop?

It's not so much about maintaining connections as hitting the server with 'n' requests to insert or update data, one for every iteration of the cfloop. This will seem OK with a test of a few records, but when you put it into production and your client pushes your application to loop around a couple of hundred rows, you are going to hit the database server a couple of hundred times as well.
As Scott suggests, you should see about looping to build a single query rather than making multiple hits to the database. Looping inside the cfquery has the benefit that you can use cfqueryparam, but if you can trust the data, i.e. it has already been sanitised, you might find it easier to use something like cfsavecontent to build up your query and output the string inside the cfquery at the end.

I have used both the query inside loop and loop inside query method. While having the loop inside the query is theoretically faster, it is not always the case. You have to try each method and see what works best in your situation.
Here is the syntax for loop inside query, using oracle for the sake of picking a database.
insert into table
(field1, field2, etc)
select null, null, etc
from dual
where 1 = 2
<cfloop>
union
select <cfqueryparam value="#value1#">
, <cfqueryparam value="#value2#">
etc
from dual
</cfloop>
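
For comparison, the same "one statement, many rows" idea can be sketched in Python with sqlite3. The target table, its two columns, and the insert_all helper are assumptions for the example; the records list plays the role of the array of structures. Every value is still passed as a bound parameter.

```python
import sqlite3

def insert_all(conn, records):
    """Insert every record with a single multi-row INSERT statement."""
    # One "(?, ?)" group per record.
    placeholders = ", ".join("(?, ?)" for _ in records)
    # Flatten the records into one parameter list, in column order.
    params = [v for r in records for v in (r["field1"], r["field2"])]
    conn.execute(
        f"INSERT INTO target (field1, field2) VALUES {placeholders}",
        params,
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (field1 TEXT, field2 INTEGER)")

records = [
    {"field1": "a", "field2": 1},
    {"field1": "b", "field2": 2},
    {"field1": "c", "field2": 3},
]
insert_all(conn, records)
inserted = conn.execute("SELECT field1, field2 FROM target ORDER BY field2").fetchall()
```

Because each row adds parameters, a very large array would need to be split into batches to stay under the driver's parameter limit.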

Depending on the database, convert your array of structures to XML, then pass that as a single parameter to a stored procedure.
In the stored procedure, do an INSERT INTO SELECT, where the SELECT statement selects data from the XML packet. You could insert hundreds or thousands of records with a single INSERT statement this way.
Here's an example.

There is a limit to how many <CFQUERY><cfloop>... iterations you can do when using <cfqueryparam>. This limit is also vendor specific. If you do not know how many records you will be generating, it is best to remove <cfqueryparam>, if it is safe to do so. Make sure your data is coming from trusted sources and is sanitised. This approach can save huge amounts of processing time, because it makes only one call to the database server, unlike an outer loop.

use fuzzy matching in django queryset filter

Is there a way to use fuzzy matching in a django queryset filter?
I'm looking for something along the lines of:
Object.objects.filter(fuzzymatch(namevariable)__gt=.9)
or is there a way to use lambda functions, or something similar in django queries, and if so, how much would it affect performance time (given that I have a stable set of ~6000 objects in my database that I want to match to)
(realized I should probably put my comments in the question)
I need something stronger than contains, something along the lines of difflib. I'm basically trying to get around doing a Object.objects.all() and then a list comprehension with fuzzy matching.
(although I'm not necessarily sure that doing that would be much slower than trying to filter based on a function, so if you have thoughts on that I'm happy to listen)
also, even though it's not exactly what I want, I'd be open to some kind of tokenized opposite-contains, like:
Object.objects.filter(['Virginia', 'Tech']__in=Object.name)
Where something like "Virginia Technical Institute" would be returned. Although case insensitive, preferably.

When you're using the ORM, the thing to understand is that everything you do converts to SQL commands and it's the performance of the underlying queries on the underlying database that matter. Case in point:
SELECT COUNT (*) ...
Is that fast? Depends on whether your database stores any records to give you that information - MySQL/MyISAM does, MySQL/InnoDB does not. In English - this is one lookup in MYISAM, and n in InnoDB.
Next thing - in order to do exact match lookups efficiently in SQL you have to tell it when you create the table - you can't just expect it to understand. For this purpose SQL has the INDEX statement - in django, use db_index=True in the field options of your model. Bear in mind that this has an added performance hit on writes (to create the index) and obviously extra storage is needed (for the datastructure) so you cannot "INDEX all the things". Also, I don't think it will help for fuzzy matching - but it's worth noting anyway.
Next consideration - how do we do fuzzy matching in SQL? Well apparently LIKE and CONTAINS allow a certain amount of searching and wildcard-results to be executed in SQL. These are T-SQL links - translate for your database server :) You can achieve this via Model.objects.get(fieldname__contains=value) which will produce LIKE SQL, or similar. There are a number of options available there for different lookups.
This may or may not be powerful enough for you - I'm not sure.
Now, for the big question: performance. Chances are if you're doing a contains search that the SQL server will have to hit all of the rows in the database - don't take my word on that, but it would be my bet - even with indexing on. With 6000 rows this might not take all that long; then again, if you're doing this on a per-connection-to-your-app basis it's probably going to create a slowdown.
Next thing to understand about the ORM: if you do this:
Model.objects.get(fieldname__contains=value)
Model.objects.get(fieldname__contains=value)
You will issue two queries to the database server. In other words, the ORM doesn't always cache the results - so you might just want to do an .all() and search in memory. Do read about caching and querysets.
Further down that last page, you'll also see Q objects, which are useful for more complicated queries.
So in summary then:
SQL contains some basic fuzzy matching-like parameters.
Whether or not these are sufficient depends on your needs.
How they perform depends on your SQL server - definitely measure it.
Whether you can cache these results in memory depends on how likely scaling is - again might be worth measuring the memory commit as a result - if you can share between instances and if the cache will be frequently invalidated (if it will be, don't do it).
Ultimately, I'd start by getting your fuzzy matching working, then measure, then tweak, then measure until you work out how to improve performance. 99% of this I learnt doing exactly that :)
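
As a starting point for the "load everything and search in memory" route, here is a minimal sketch using difflib from the Python standard library. It works on a plain list of names; in a Django project that list might come from something like a values_list call on the model, which is an assumption and not shown here. The helper names fuzzy_filter and contains_all_tokens are hypothetical.

```python
from difflib import SequenceMatcher

def fuzzy_filter(names, query, threshold=0.9):
    """Return names whose similarity to query is at least threshold, best first."""
    q = query.lower()
    scored = [(SequenceMatcher(None, q, n.lower()).ratio(), n) for n in names]
    return [n for score, n in sorted(scored, reverse=True) if score >= threshold]

def contains_all_tokens(name, tokens):
    """Case-insensitive check that every token occurs somewhere in name."""
    lowered = name.lower()
    return all(t.lower() in lowered for t in tokens)

names = ["Virginia Tech", "Georgia Tech", "Virginia Technical Institute"]
close = fuzzy_filter(names, "Virgina Tech")  # misspelled query
tokenised = [n for n in names if contains_all_tokens(n, ["virginia", "tech"])]
```

With around 6000 names this runs one comparison per name per query, so it is worth timing before deciding between this and a database-side approach.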

With PostgreSQL as the database, you can use TrigramSimilarity to do fuzzy search and rank your results by different weights as well. Here is the link to the documentation:
https://docs.djangoproject.com/en/2.0/ref/contrib/postgres/search/#trigram-similarity
For full text search you can refer to https://czep.net/17/full-text-search.html

If you need something stronger than contains lookup, have a look at regex lookups: https://docs.djangoproject.com/en/1.0/ref/models/querysets/#regex

Methods for preventing SQL Injection in ColdFusion

I'm wondering if the # symbol is enough.
This is a part of the sql command that I'm using
WHERE login='#FORM.login#' AND password COLLATE Latin1_General_CS_AS = '#FORM.password#'
I'm trying to test it with user names such as ' OR 1=1 and variants of it, but even though it's not working I don't want to have a false sense of security.
I've read that using <cfqueryparam> can prevent this form of attack, are there any other ways?

The way to go is <cfqueryparam>. It's simple, straight-forward, datatype-safe, can handle lists (for use with IN (...)) and can handle conditional NULLs. Plus you get a benefit out of it in loops - the query text itself is sent to the server only once, with each further loop iteration only parameter values are transferred.
You can use '#var#' and be relatively safe. In the context of a <cfquery> tag ColdFusion will expand the value of var with single quotes escaped, so there is some kind of automatic defense against SQL injection. But beware: This will — by design — not happen with function return values: For example, in '#Trim(var)#' single quotes won't be escaped. This is easily overlooked and therefore dangerous.
Also, it has a disadvantage when run in a loop: Since variable interpolation happens before the SQL is sent to the server, ColdFusion will generate a new query text with every iteration of a loop. This means more bytes over the wire and no query plan caching on the server, as every query text is different.
In short: Use <cfqueryparam> wherever you can:
WHERE
login = <cfqueryparam value="#FORM.login#" cfsqltype="CF_SQL_VARCHAR">
AND password = <cfqueryparam value='#Hash(FORM.password, "SHA-512")#' cfsqltype="CF_SQL_VARCHAR">
Instead of a simple Hash(), you should indeed use a salted hash, as @SLaks pointed out in his comment.
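
A salted hash can be sketched as follows. This is a Python illustration using PBKDF2 from the standard library, not ColdFusion code; the function names, the 16-byte salt, and the iteration count are assumptions chosen for the example. The salt is stored alongside the digest, and the same salt is reused when checking a login attempt.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    return salt, digest

def verify_password(password, salt, expected, iterations=200_000):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)

salt, stored = hash_password("correct horse")
ok = verify_password("correct horse", salt, stored)
bad = verify_password("wrong guess", salt, stored)
```

The stored salt and digest would then be compared against values fetched with a parameterised query, rather than placing the password in the WHERE clause.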

An even better way to go would be to use stored procedures for everything.
