How do I append multiple criteria in an Elasticsearch URI?

I don't have the possibility of sending a request body, so I have to concatenate the criteria in the main URI. Right now I have something like server:port/index/_search?size=10 for pagination with a page size of 10. I also need to add the "from" criterion, but nothing I've tried as a separator works: neither / nor ,.

The correct format would be http://localhost:9200/<your-index>/_search?from=10&size=20.
You need to append each additional parameter with an & separator.
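For illustration, here is a minimal sketch of that pagination using the Python requests library. The cluster address and index name (my-index) are assumptions, not taken from the question:

import requests

# Hypothetical local cluster and index; adjust to your setup.
BASE_URL = "http://localhost:9200/my-index/_search"

page_size = 10
for page in range(3):  # fetch the first three pages
    params = {"from": page * page_size, "size": page_size}
    response = requests.get(BASE_URL, params=params)
    response.raise_for_status()
    hits = response.json()["hits"]["hits"]
    print(f"page {page}: {len(hits)} hits")

Each iteration only changes the from offset; requests takes care of joining the parameters with & in the query string.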


Remove Varying Domain Names from URLs Using a Calculated Field / REGEX

I am attempting to create a calculated field called "Page" in Data Studio, using a custom CSV upload data source: a list of URLs from several websites combined into one big list of thousands of URLs on dozens of different domain names.
In the CSV, I have a field called "URL". It contains each page's full URL, including the Root Domain Name.
I have another field for each of those records called "Root Domain Name". It has the Root Domain Name for each URL.
I'd like to extract the Root Domain Name identified in one field from the URL identified in another field, leaving just the "Page" path. The URLs vary in terms of Top Level Domain - some are .com, some are .co.uk, some are .fr, etc.
Ultimately, the output would be something like this:
www.domain.com/test-page -> /test-page
www.domain.co.uk/test-page -> /test-page
www.domain.fr/test-page -> /test-page
etc.
It seems like it'd be something like this, but it obviously isn't working, hence my presence here today:
REGEXP_REPLACE(URL,Root Domain Name,'')
I am thinking that removing the value of one field from the value of another is one way of getting at it, but there might be a better way of simply manipulating the URL field to remove everything prior to the third / instead.
I need to keep the first / after the Domain Name (a data formatting issue).
I'll be plugging away at this one, but I figured there has to be someone out there who has seen this before, so I welcome any input.
Have a good day everyone.
I use the following to decompose a full URL in Data Studio:
URL Path: REGEXP_EXTRACT(Url, '^https?://[^/]+(/[^?#]*)')
URL Query: REGEXP_EXTRACT(Url, '[^?]+\\?(.+)')
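If it helps to sanity-check those patterns outside Data Studio, here is a small illustration of the same two regular expressions in plain Python (the sample URL is made up):

import re

# Made-up sample URL, purely for illustration.
url = "https://www.domain.co.uk/test-page?utm_source=newsletter"

# Same patterns as the two calculated fields above.
path_match = re.search(r"^https?://[^/]+(/[^?#]*)", url)
query_match = re.search(r"[^?]+\?(.+)", url)

print(path_match.group(1) if path_match else None)    # /test-page
print(query_match.group(1) if query_match else None)  # utm_source=newsletter

Note that the path pattern assumes the stored URL includes the http:// or https:// scheme; if your CSV stores URLs without it, the capture group will not match and the pattern would need adjusting.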

Drupal 8: Altering Search API queries

I'm working on a project which includes the following activated modules:
Drupal core 8.2.3
Database Search 8.x-1.0-beta4
Search API 8.x-1.0-beta4
Search API Term Handlers 8.x-1.0-beta4
Views 8.2.3
I have a list of nids which need to be excluded from the search result of the site-wide search. The search uses Search API and has been set up using Views.
The table in the database is: "search_api_db_default_index"
The field I wish to target is: "nid"
I wasn't able to get hook_search_api_query_alter or hook_search_api_results_alter to fire, so I am attempting to manipulate the query through hook_views_query_alter.
I have attempted to use both the "addWhere" and "addCondition" methods with the following syntax:
When using the addCondition method, I attempted
$query->addCondition('search_api_db_default_index.nid', $oneBadNid, '<>');
and
$query->addCondition('search_api_db_default_index.nid', $manyBadNids, 'NOT IN');
and when using the addWhere method, I attempted
$query->addWhere('AND', 'search_api_index_default_index.nid', $oneBadNid, '<>');
and
$query->addWhere('AND', 'search_api_index_default_index.nid', $manyBadNids, 'NOT IN');
Regardless of whether or not I prefix the field with the table name, searching always triggers the following notice:
Unknown field in filter clause: 'search_api_db_default_index.nid'.
The field name in the notice always appears wrapped in an HTML-encoded single quotation mark, and this happens whether I use double or single quotation marks around the supplied table.field parameter.
I am not even sure that this is what is keeping me from altering my query, but it is the only thing close to an error which I have discovered in this process. It's also possible that I'm simply not supposed to be targeting the table in the manner written, but I did not find any documentation directing me to the proper methodology.
I would appreciate any insight into this issue! Thanks!
Generally you can use
$fields = $query->getIndex()->getFields();
on the query to get an array of fields you can use within the search_api query.
Piggy-backing off of Nebel54's comment, and after attempting this on my own: you don't need to include the table name when calling addCondition. However, I did need to use hook_search_api_query_alter rather than the Views-specific hook.
function mymodule_search_api_query_alter(\Drupal\search_api\Query\QueryInterface &$query) {
  // Ensure field_myfield is being indexed.
  $fields = $query->getIndex()->getFields();
  if (isset($fields['field_myfield'])) {
    $query->addCondition('field_myfield', 'myvalue', '<>');
  }
}

Amazon CloudSearch: Is it possible to write a query that returns... everything?

I've put together a simple search form, with a search box and a couple of filters as dropdowns. Everything works as you'd expect, except that I want the behavior to be that when the user leaves everything completely blank (no search query, no filters) they simply get everything returned (paginated of course).
I'm currently achieving this by detecting this special case and querying my local database, but there are some advantages to doing it 100% with CloudSearch. Is there a way to build a request that simply returns a paginated list of every document? In other words, is there a CloudSearch equivalent of "SELECT id FROM x LIMIT n"?
Thanks in advance!
Joe
See the Search API.
?q=matchall&q.parser=structured will match all the documents.
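If you are calling the domain over plain HTTP rather than through an SDK, a minimal sketch with the Python requests library might look like this (the domain URL is a placeholder; 2013-01-01 is the CloudSearch API version in the path):

import requests

# Placeholder search endpoint; replace with your CloudSearch domain.
SEARCH_URL = "https://search-your-domain.us-east-1.cloudsearch.amazonaws.com/2013-01-01/search"

params = {"q": "matchall", "q.parser": "structured", "size": 10}
response = requests.get(SEARCH_URL, params=params)
response.raise_for_status()
print(response.json()["hits"]["found"])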
The easiest way would be to use a not operator, for example:
?q=dog|-dog
would return all documents that contain 'dog' plus all documents that do not contain 'dog', i.e. everything. You would need to intercept the special case, as you already are, and just substitute the query/not-query combo, and you should get everything back.
For someone looking for an answer using boto3:
import boto3

CLOUD_SEARCH_CLIENT = boto3.client(
    'cloudsearchdomain',
    aws_access_key_id='',
    aws_secret_access_key='',
    region_name='',
    endpoint_url="https://search-your-endpoint-url.amazonaws.com"
)

response = CLOUD_SEARCH_CLIENT.search(
    query="matchall",
    queryParser='structured'
)
print(response)
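Since the original question also asks for pagination, note that the same search call accepts size and start parameters. A hedged sketch, reusing the CLOUD_SEARCH_CLIENT defined above (the page size is an arbitrary assumption):

page_size = 20
for page in range(3):  # first three pages, for illustration
    page_response = CLOUD_SEARCH_CLIENT.search(
        query="matchall",
        queryParser='structured',
        size=page_size,
        start=page * page_size
    )
    print(page_response['hits']['found'], len(page_response['hits']['hit']))

For deep pagination CloudSearch also offers a cursor parameter, which is generally preferred over large start offsets.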

REST and Filtering records

I currently have a .NET method that looks like this: GetUsers(Filter filter), and it is invoked from the client by sending a SOAP request. As you can probably guess, it takes a bunch of filters and returns a list of users that match those filters. The filters aren't too complicated; they're basically just a set of from date, to date, age, sex, etc., and the set of filters I have at any point is static.
I was wondering what the RESTful way of doing this was. I'm guessing I'll have a Users resource. Will it be something like GET /Users?fromDate=11-1-2011&toDate=11-2-2011&age=&sex=M ? Is there a way to pass it a Filter without having to convert it into individual attributes?
I'm consuming this service from a mobile client, so I think the overhead of an extra request that actually creates a filter (POST /filters) is bad UX. Even if we do this, what does POST /filters even mean? Is it supposed to create a filter record in the database? How would the server keep track of the filter that was just created if my sequence of requests is as follows?
POST /filters -- returns a filter
{
"from-date" : "11-1-2011",
"to-date" : "11-2-2011",
"age" : "",
"sex" : "M"
}
GET /Users?filter=filter_id
Apologies if the request came off as being a little confused. I am.
Thanks,
Teja
We are doing it just like you had it
GET /Users?fromDate=11-1-2011&toDate=11-2-2011&age=&sex=M
We have 9 querystring values.
I don't see any problem with that.
The way I handle it is I do a POST with the body containing the parameters and then I return a redirect to a GET. What the GET URL looks like is completely up to the server. If you want to convert the filter into separate query params you can, or if you want to persist a filter and then add a query param that points to the saved filter that's ok too. The advantage is that you can change your mind at any time and the client doesn't care.
Also, because you are doing a GET, you can take advantage of caching, which should more than make up for the extra request.
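As a minimal sketch of the query-string approach (the host name and the endpoint are placeholders, and the parameter names are taken from the example above), a client can serialize the filter object into query parameters rather than POSTing it:

import requests

# Hypothetical filter, mirroring the example above.
filter_params = {
    "fromDate": "11-1-2011",
    "toDate": "11-2-2011",
    "age": "",
    "sex": "M",
}

# requests encodes the dict as ?fromDate=...&toDate=...&age=&sex=M
response = requests.get("https://api.example.com/Users", params=filter_params)
response.raise_for_status()
users = response.json()  # assuming the endpoint returns a JSON array of users
print(len(users), "users matched")

Because the full filter lives in the URL, intermediate caches can serve repeated identical queries, which is the advantage pointed out above.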

How can I encode/escape a varchar to be more secure without using cfqueryparam?

How can I encode/escape a varchar to be more secure without using cfqueryparam? I want to implement the same behaviour without using <cfqueryparam> in order to get around the "Too many parameters were provided in this RPC request. The maximum is 2100" problem. See: http://www.bennadel.com/blog/1112-Incoming-Tabular-Data-Stream-Remote-Procedure-Call-Is-Incorrect.htm
Update:
I want the validation / security part, without generating a prepared-statement.
What's the strongest encode/escape I can do to a varchar inside <cfquery>?
Something similar to mysql_real_escape_string() maybe?
As others have said, that length-related error originates at a deeper level, not within the queryparam tag. And it offers some valuable protection and therefore exists for a reason.
You could always either insert those values into a temporary table and join against that one, or use the list functions to split that huge list into several smaller lists which are then used separately, as in the examples below.
SELECT name,
       ..... ,
       createDate
FROM   somewhere
WHERE  (someColumn IN (a, b, c, d, e)
        OR someColumn IN (f, g, h, i, j)
        OR someColumn IN (.........));
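To make the "split into smaller lists" idea concrete, here is a language-agnostic sketch in Python (purely illustrative; in ColdFusion you would use the list functions mentioned above, and run_query is a hypothetical helper) that breaks a long id list into chunks and runs one query per chunk:

# Purely illustrative: chunk a long parameter list so no single call
# comes anywhere near the ~2100-parameter RPC limit.
def chunk(values, size=500):
    for i in range(0, len(values), size):
        yield values[i:i + size]

ids = list(range(5000))  # pretend these came from an external source
results = []
for group in chunk(ids):
    placeholders = ", ".join("?" for _ in group)
    sql = "SELECT name, createDate FROM somewhere WHERE someColumn IN (" + placeholders + ")"
    # results.extend(run_query(sql, group))  # run_query is a hypothetical helper
print(sql[:70] + " ...")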
cfqueryparam performs multiple functions.
It verifies the datatype. If you say integer, it makes sure the value is an integer, and if not, it does not allow it to pass.
It separates the data of a SQL script from the executable code (this is where you get protection from SQL injection). Anything passed as a param cannot be executed.
It creates bind variables at the DB engine level to help improve performance.
That is how I understand cfqueryparam to work. Did you look into the option of making several small calls vs one large one?
It is a security issue; it stops SQL injection.
Adobe recommends that you use the cfqueryparam tag within every cfquery tag, to help secure your databases from unauthorized users. For more information, see Security Bulletin ASB99-04, "Multiple SQL Statements in Dynamic Queries," at www.adobe.com/devnet/security/security_zone/asb99-04.html, and "Accessing and Retrieving Data" in the ColdFusion Developer's Guide.
The first thing I'd be asking myself is "how the heck did I end up with more than 2100 params in a single query?". Because that in itself should be a very very big red flag to you.
However if you're stuck with that (either due to it being outwith your control, or outwith your motivation levels to address ;-), then I'd consider:
the temporary table idea mentioned earlier
for values over a certain length just chop 'em in half and join 'em back together with a string concatenator, eg:
SELECT *
FROM tbl
WHERE col IN ('a', ';DROP DATABAS'+'E all_my_data', 'good', 'etc' [...])
That's a bit grim, but then again your entire query sounds grim, so that might not be such a concern.
screening param values that are over a certain length or that have stop words in them, or something along those lines. This is also quite a grim suggestion.
SERIOUSLY go back over your requirement and see if there's a way to not need 2100+ params. What is it you're actually needing to do that requires all this???
The problem does not reside with cfqueryparam, but with MS SQL Server itself:
Every SQL batch has to fit in the Batch Size Limit: 65,536 * Network Packet Size.
Maximum size for a SQL Server Query? IN clause? Is there a Better Approach
And
http://msdn.microsoft.com/en-us/library/ms143432.aspx
The few times that I have come across this problem I have been able to rewrite the query using subselects and/or table joins. I suggest trying to rewrite the query like this in order to avoid the parameter max.
If it is impossible to rewrite (e.g. all of the multiple parameters are coming from an external source) you will need to validate the data yourself. I have used the following regex in order to perform a safe validation:
<cfif ReFindNoCase("[^a-z0-9_\ \,\.]", arguments.InputText) IS NOT 0>
    <cfthrow type="Application" message="Invalid characters detected">
</cfif>
The code will force an error if any character other than a letter, number, space, comma, underscore, or period is found in the text string. (You may want to handle the situation more cleanly than just throwing an error.) I suggest you modify this as necessary based on the expected or allowed values in the fields you are validating. If you are validating a string of comma-separated integers, you may switch to a more limiting regex like "[^0-9\ \,]", which will only allow numbers, commas, and spaces.
This answer will not escape the characters; it will not allow them in the first place. It should be used on any data that you will not pass through <cfqueryparam>. Personally, I have only found a need for this when I use a dynamic sort field; not all databases will allow you to use bind variables with the ORDER BY clause.
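As a rough cross-language illustration of the same whitelist idea (plain Python rather than ColdFusion, with a made-up input value), the comma-separated-integers variant could be checked like this:

import re

# Hypothetical input: a comma-separated list of ids destined for an IN clause.
id_list = "15, 23, 42, 108"

# Mirrors the stricter "[^0-9\ \,]" pattern above: anything that is not a
# digit, space, or comma is rejected before the value ever reaches the SQL.
if re.search(r"[^0-9 ,]", id_list):
    raise ValueError("Invalid characters detected")
print("validated:", id_list)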