Google Directory API: Search for groups using a search term containing a space - google-admin-sdk

I am looking to search for Groups using the Google Directory API, as per the documentation at https://developers.google.com/admin-sdk/directory/v1/guides/search-groups#fields
In particular, I am searching on the "name" field using a prefix. However, I am encountering problems when the search term contains a space. According to the documentation, you should "Surround with single quotes ' if the query contains whitespace". Whilst that works with the = operator (exact match), it does not work with the : operator (matching the start of the text). In that case, the API returns a "Bad Request" status code.
Here's a code sample (in C#):
public Task<Groups> SearchGroupsAsync(DirectoryService service, string customer, string searchTerm, CancellationToken cancellationToken)
{
var groupsRequest = service.Groups.List();
groupsRequest.Query = $"name:{searchTerm}*";
groupsRequest.Customer = customer;
return groupsRequest.ExecuteAsync(cancellationToken);
}
The above works fine for any searchTerm that does not contain a space. As soon as a space is used in the search term, a "BadRequest" is returned.
I have tried all of the following examples, and all of them get a BadRequest, with an error object returned whose message is, for example, Invalid Input: name:'Group *'
name:Group *
name:'Group '*
name:'Group *'
name:'Group '
What does work though, is the following:
name='Group 1'
However, I don't want to search with an exact match, just a prefix. Is it possible to search on groups with a Prefix containing a space?

Hi @Tim, it seems we can use the + character between words.
In your case, you can simply use name:Group to get groups which contain Group as part of their name.
If you want a group which matches Group 12344, use name:Group+1*
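A minimal Python sketch of how such a query string could be assembled, assuming the + substitution behaves as reported in this answer. build_prefix_query is a hypothetical helper for illustration only and is not part of any Google client library.

```python
def build_prefix_query(search_term: str) -> str:
    """Build a prefix query on the group name.

    Assumption: spaces may be replaced with '+', as suggested above.
    """
    # Split on any whitespace so repeated or trailing spaces do not
    # produce empty segments, then join the words with '+'.
    words = search_term.split()
    return "name:" + "+".join(words) + "*"


print(build_prefix_query("Group 1"))
```

The resulting string would be assigned to the Query property of the list request in place of the interpolated value.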

Related

If cell contains '?' then formula X if not then copy value

At the moment I am busy with a spreadsheet to analyse results per URL. The problem is that when I make a list of unique URLs, the URLs with a parameter appended (for example '?fbads') are seen as unique. Instead, I need these results to be merged with the main URL. See the example below:
https://www.holidayguru.nl/deal/accommodatie/luxe-strandvakantie-in-ijmuiden-5e25ba62-e001-4072-8eb5-b6c3b0e7e66f/?fbclid=IwA
&
https://www.holidayguru.nl/deal/accommodatie/luxe-strandvakantie-in-ijmuiden-5e25ba62-e001-4072-8eb5-b6c3b0e7e66f/
Should both be: https://www.holidayguru.nl/deal/accommodatie/luxe-strandvakantie-in-ijmuiden-5e25ba62-e001-4072-8eb5-b6c3b0e7e66f/
I already fixed this with a formula, but I need one list with all URLs. So I'm looking at two options. Either, in the formula
=LEFT(A11,FIND("?",A11)-1)
that I use right now, I need a way to say: if you don't find a '?', then just copy cell A11.
Or...
I have to work with an IF function to say: if A11 contains '?', then execute the LEFT function, otherwise use A11.
I can't manage to get the formula working. Demo sheet is down below :). Thanks!
Example spreadsheet
Delete everything from Sheet1!A:A (including the header) and place the following in Sheet1!A1:
=ArrayFormula({"UNIQUE URLS"; UNIQUE(FILTER(REGEXEXTRACT(URLs!A2:A,"[^\?]+"),URLs!A2:A<>""))})
This will create the header (which you can change as you like within the formula itself) and a unique list of URLs as determined only by the portion before a question mark (if a question mark exists) or to the end of the original URL.
For your reference, the expression [^\?]+ means "a string of the greatest length that can be extracted without containing a literal question mark."
[ ] = "any of the characters contained herein"
[^ ] = "not any of these characters"
\ = literal marker (i.e., whatever is next will be treated as a literal character)
\? = literal question mark (using the literal marker before the ? is necessary, since alone, the ? has a separate special meaning in REGEX-type expressions)
+ = "one or more of the preceding character or group of characters"
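The same idea can be sketched outside the spreadsheet. The Python below is only an illustration of what the expression [^\?]+ does combined with de-duplication; strip_query and unique_urls are hypothetical names, and this is not how Sheets evaluates the formula internally.

```python
import re


def strip_query(url: str) -> str:
    # [^?]+ finds the first run of characters that contains no '?',
    # i.e. everything before the question mark (or the whole URL).
    match = re.search(r"[^?]+", url)
    return match.group(0) if match else url


def unique_urls(urls):
    # Skip blank cells and keep the first occurrence of each stripped URL.
    seen = {}
    for url in urls:
        if url:
            seen.setdefault(strip_query(url), None)
    return list(seen)


print(unique_urls([
    "https://example.com/deal/?fbclid=IwA",
    "https://example.com/deal/",
]))
```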

How to split a string in db2?

I have some URLs in my cas_fnd_dwd_det table:
casi_imp_urls cas_code
----------------------------------- -----------
www.casiac.net/fnds/CASI/qnxp.pdf
www.casiac.net/fnds/casi/as.pdf
www.casiac.net/fnds/casi/vindq.pdf
www.casiac.net/fnds/CASI/mnip.pdf
How do I copy the letters between the last '/' and '.pdf' to another column?
Expected outcome:
casi_imp_urls cas_code
----------------------------------- -----------
www.casiac.net/fnds/CASI/qnxp.pdf qnxp
www.casiac.net/fnds/casi/as.pdf as
www.casiac.net/fnds/casi/vindq.pdf vindq
www.casiac.net/fnds/CASI/mnip.pdf mnip
The URL prefixes below are static:
www.casiac.net/fnds/CASI/
www.casiac.net/fnds/casi/
Please advise: how do I select the codes between the last '/' and '.pdf'?
I would recommend taking a look at REGEXP_SUBSTR, which lets you apply a regular expression. Db2 has string processing functions, but the regex function may be the easiest solution. See the SO question on regex and URI parts for different ways of writing the expression. The following would return the last slash, the file name and the extension:
SELECT REGEXP_SUBSTR('http://fobar.com/one/two/abc.pdf','\/(\w)*.pdf' ,1,1)
FROM sysibm.sysdummy1
/abc.pdf
The following uses REGEXP_REPLACE; the pattern is from this SO question, with the pdf file extension added. It splits the string into three parts: everything up to the last slash, then the file name, then the ".pdf". The '$1' returns group 1 (groups start with 0). Group 2 would be the ".pdf".
SELECT REGEXP_REPLACE('http://fobar.com/one/two/abc.pdf','(?:.+\/)(.+)(.pdf)','$1' ,1,1)
FROM sysibm.sysdummy1
abc
You could apply LENGTH and SUBSTR to extract the relevant part or try to build that into the regex.
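To make the capture-group logic easier to follow, here is a small Python sketch of the same idea. It is not Db2 code: extract_code is a hypothetical helper, and the pattern is a slightly stricter variant (the dot before pdf is escaped and the match is tied to the end of the string).

```python
import re

# Everything up to the last '/', then the file name (group 1),
# then a literal ".pdf" at the end of the string.
PATTERN = re.compile(r".*/(.+)\.pdf$")


def extract_code(url: str):
    match = PATTERN.match(url)
    # Return None when the URL does not end in "/<name>.pdf".
    return match.group(1) if match else None


for url in ("www.casiac.net/fnds/CASI/qnxp.pdf", "www.casiac.net/fnds/casi/as.pdf"):
    print(url, extract_code(url))
```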
This is for Db2 versions older than 11.1. I'm not sure whether it works on 9.5, but it should definitely work from 9.7 onwards.
Try this as is.
with cas_fnd_dwd_det (casi_imp_urls) as (values
'www.casiac.net/fnds/CASI/qnxp.pdf'
, 'www.casiac.net/fnds/casi/as.pdf'
, 'www.casiac.net/fnds/casi/vindq.pdf'
, 'www.casiac.net/fnds/CASI/mnip.PDF'
)
select
casi_imp_urls
, xmlcast(xmlquery('fn:replace($s, ".*/(.*)\.pdf", "$1", "i")' passing casi_imp_urls as "s") as varchar(50)) cas_code
from cas_fnd_dwd_det

Wrong regexp query for elasticsearch

I have some problems with the regexp query for elasticsearch. In my index there's a text field with comma-separated numeric values (IDs), f.e.
2,140,3,2495
And I have the following query term:
"regexp" : {
"myIds" : {
"value" : "^2495,|,2495,|,2495$|^2495$",
"boost" : 1
}
}
But my result list is empty.
Let me say that I know regexp queries are kind of slow, but the index already exists and is filled with millions of documents, so unfortunately restructuring it is not an option. So I need a regex solution.
In Elasticsearch regex, patterns are anchored by default, and ^ and $ are treated as literal chars.
What you mean to use is "2495,.*|.*,2495,.*|.*,2495|2495" - 2495, at the start of string, ,2495, in the middle, ,2495 at the end or a whole string equal to 2495.
Or, you may use a simpler
"(.*,)?2495(,.*)?"
That means
(.*,)? - an optional text (not including line breaks) ending with ,
2495 - your value
(,.*)? - an optional text (not including line breaks) starting with ,
Here is an online demo showing how this expression works (not a proof though).
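The anchoring behaviour can be approximated locally. The sketch below uses Python's re.fullmatch as a stand-in for the implicit anchoring; that is an assumption for illustration, since Python's regex engine is not the one Elasticsearch uses, and contains_id is a hypothetical helper.

```python
import re

PATTERN = r"(.*,)?2495(,.*)?"


def contains_id(field_value: str) -> bool:
    # fullmatch requires the whole string to match, which mimics a
    # pattern that is anchored at both ends by default.
    return re.fullmatch(PATTERN, field_value) is not None


for value in ("2,140,3,2495", "2495", "12495", "24950,3"):
    print(value, contains_id(value))
```

Values such as 12495 or 24950 are rejected because the ID must be bounded by a comma or by the start or end of the string.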
OK, I got it to work, but I've now run into another problem. I built the string as follows:
(.*,)?2495(,.*)?|(.*,)?10(,.*)?|(.*,)?898(,.*)?
It works well for a few IDs, but if I have, let's say, 50 IDs, then ES throws an exception saying that the regexp is too complex to process.
Is there a way to simplify the regexp or restructure the query itself?
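One possible simplification, offered as an untested idea rather than a confirmed fix, is to share the optional prefix and suffix around a single alternation instead of repeating them for every ID. The Python sketch below only builds and checks such a pattern locally; build_ids_pattern is a hypothetical helper, and whether the shorter expression stays under the Elasticsearch complexity limit for 50 IDs has not been verified here.

```python
import re


def build_ids_pattern(ids):
    # One shared "(.*,)?" prefix and "(,.*)?" suffix around all IDs.
    # int() guards against non-numeric input ending up in the pattern.
    alternation = "|".join(str(int(i)) for i in ids)
    return f"(.*,)?({alternation})(,.*)?"


pattern = build_ids_pattern([2495, 10, 898])
print(pattern)
# fullmatch stands in for the implicit anchoring of regexp queries.
print(bool(re.fullmatch(pattern, "2,140,3,2495")))
```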

cfsearch - Error executing query : org.apache.lucene.queryParser.ParseException: Cannot parse : Lexical error

I've got a basic cfsearch that works fine, but occasionally it can be broken with search strings like the following;
my search string]
"my search string
my search string[
my search: string
Any of the above will result in an error like;
Error executing query : org.apache.lucene.queryParser.ParseException: Cannot parse '"my search string': Lexical error at line 1, column 32. Encountered: after : "\"my search string"
I was thinking I could strip out those characters, but you might have a working search term with, say, two "" - ie. "my search string" - which is valid. Is there a preferable way to prepare a string for cfsearch?
So, in the example of:
"my search string
it would strip out the first ". But if the search term was:
"my search string"
all good - leave it alone. Any ideas?! Are there any other characters that can cause an error? For example, a hacker tried this;
XyOk,'.](.]]]'
Which caused an error.
Use the VerityClean UDF from CFLib to sanitize the Verity/Lucene search parameter. (NOTE: Add :, ^ and * to the pipe-delimited reBadChars variable so they will be stripped for Lucene.)
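For readers who want to see the general shape of such a clean-up step, here is a rough Python sketch. It is not the VerityClean UDF and does not reproduce its character list; clean_search_term and the set of stripped characters are assumptions chosen for illustration. It removes a handful of characters that have special meaning in Lucene query syntax and keeps double quotes only when they are balanced.

```python
import re

# Assumed set of characters to strip for this sketch; adjust as needed.
BAD_CHARS = re.compile(r'[\[\](){}:^*?~!\\]')


def clean_search_term(term: str) -> str:
    cleaned = BAD_CHARS.sub(" ", term)
    # An odd number of double quotes means an unterminated phrase,
    # so drop the quotes; a balanced pair is left alone.
    if cleaned.count('"') % 2 != 0:
        cleaned = cleaned.replace('"', " ")
    # Collapse the whitespace left behind by the removals.
    return " ".join(cleaned.split())


print(clean_search_term('"my search string'))
print(clean_search_term('"my search string"'))
```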
http://www.cflib.org/udf/verityClean

Why would regex to separate filename from extension not work in ColdFusion?

I'm trying to retrieve a filename without the extension in ColdFusion. I am using the following function:
REMatchNoCase( "(.+?)(\.[^.]*$|$)" , "Doe, John 8.15.2012.docx" );
I would like this to return an array like: ["Doe, John 8.15.2012","docx"]
but instead I always get an array with one element - the entire filename:["Doe, John 8.15.2012.docx"]
I tried the regex string above on rexv.org and it works as expected, but not on ColdFusion. I got the string from this SO question: Regex: Get Filename Without Extension in One Shot?
Does ColdFusion use a different syntax? Or am I doing something wrong?
Thanks.
Why you're not getting expected results...
The reason you are getting a one-item array with the whole filename is because your pattern matches the entire filename, and matches once.
It is capturing the two groups, but rematch returns arrays of matches, not arrays of the captured groups, so you don't see those groups.
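The difference between a match and its captured groups can be seen in any regex engine. The Python sketch below is only an analogy (it is not ColdFusion, and the variable names are illustrative): the whole match is the full filename, while the groups hold the two parts.

```python
import re

pattern = re.compile(r"(.+?)(\.[^.]*$|$)")
match = pattern.search("Doe, John 8.15.2012.docx")

# The overall match: this is what a "list of matches" function reports.
whole_match = match.group(0)
# The captured groups: name, and the extension including its dot.
captured = match.groups()

print(whole_match)
print(captured)
```

Note that the second group includes the leading dot, so even with access to the groups the extension would come back as .docx rather than docx.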
How to solve the problem...
If you are dealing with simple files (i.e. no .htaccess or similar), then the simplest solution is to just use...
ListLast( filename , '.' )
....to get only the file extension and to get the name without extension you can do...
rematch( '.+(?=\.[^.]+$)' , filename )
This uses a lookahead to ensure there is a . followed by at least one non-. at the end of the string, but (since it's a lookahead) it is excluded from the match (so you only get the pre-extension part in your match).
To deal with non-extensioned files (e.g. .htaccess or README) you can modify the above regex to .+(?=(?:\.[^.]+)?$) which basically does the same thing except making the extension optional. However, there isn't a trivial way to update the ListLast method for these (I guess you'd need to check len(extension) LT len(filename)-1 or similar).
(optional) Accessing captured groups...
If you want to get at the actual captured groups, the closest native way to do this in CF is using the refind function, with the fourth argument set to true - however, this only gives you positions and lengths - requiring that you use mid to extract them yourself.
For this reason (amongst many others), I've created an improved regex implementation for CF, called cfRegex, which lets you return the group text directly (i.e. no messing around with mid).
If you wanted to use cfRegex, you can do so with your original pattern like so:
RegexMatch( '(.+?)(\.[^.]*$|$)' , filename , 1 , 0 , 'groups' )
Or with named arguments:
RegexMatch( pattern='(.+?)(\.[^.]*$|$)' , text=filename , returntype='groups' )
And you get returned an array of matches, within each element being an array of the captured groups for that match.
If you're doing lots of regex work dealing with captured groups, cfRegex is definitely better than doing it with CF's re methods.
If all you care about is getting the extension and/or the filename with extension excluded then the previous examples above are sufficient.
@Peter's response is great, however the approach is perhaps a bit longer-winded than necessary. One can do this with reMatch() with a slight tweak to the regex.
<cfscript>
param name="URL.filename";
sRegex = "^.+?(?=(?:\.[^.]+?)?$)";
aMatch = reMatch(sRegex, URL.filename);
writeDump(aMatch);
</cfscript>
This works on the following filename patterns:
foo.bar
foo
.htaccess
John 8.15.2012.docx
Explanation of the regex:
^ From the beginning of the string
.+? One or more (+) characters (.), but the fewest (?) that will work with the rest of the regex. This is the file name.
(?=) Look ahead. Make sure the stuff in here appears in the string, but don't actually match it. This is the key bit to NOT return any file extension that might be present.
(?: Group this stuff together, but don't remember it for a back reference.
\. A dot. This is the separator between file name and file extension.
[^.]+? One or more (+) single ([]) non-dot characters (^.), again matching the fewest possible (?) that will allow the regex as a whole to work.
? (This is the one after the (?:) group). Zero or one of those groups: ie: zero or one file extensions.
$ To the end of the string
I've only tested with those four file name patterns, but it seems to work OK. Other people might be able to finetune it.
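As a cross-check, the same pattern can be run against those four file names in Python. This is a sketch under the assumption that Python's re module treats the lazy quantifiers and the lookahead the same way for these inputs; name_without_extension is a hypothetical helper and the code is not ColdFusion.

```python
import re

NAME_RE = re.compile(r"^.+?(?=(?:\.[^.]+?)?$)")


def name_without_extension(filename: str):
    match = NAME_RE.search(filename)
    # The lookahead checks for an optional extension without consuming it.
    return match.group(0) if match else None


for name in ("foo.bar", "foo", ".htaccess", "John 8.15.2012.docx"):
    print(name, "->", name_without_extension(name))
```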
A few more ways of achieving the same result. They all execute in roughly the same amount of time.
<cfscript>
str = 'Doe, John 8.15.2012.docx';
// sans regex
arr1 = [
reverse( listRest( reverse( str ), '.' ) ),
listLast( str, '.' )
];
// using Java String lastIndexOf()
arr2 = [
str.substring( 0, str.lastIndexOf( '.' ) ),
str.substring( str.lastIndexOf( '.' ) + 1 )
];
// using listToArray with non-filename safe character replace
arr3 = listToArray( str.replaceAll( '\.([^\.]+)$', '|$1' ), '|' );
</cfscript>