Working with strings in GCP-Workflows and GCP-Admin - google-cloud-platform

I'm integrating a project in GCP-Workflows with GCP-Admin, but I'm having trouble working with some of the data. When I extract a date it is delivered in this format: 2020-12-28T11:20:05.000Z, so I can't turn the string into an int, and apparently there is no function in GCP Workflows like substring() either. I need to use the date in an IF, checking whether it is greater or less than a reference date.
How can I do this?

There are some gaps in the functions implemented in Workflows for now. New ones are coming very soon, but I don't know if they will solve your problem.
Anyway, with Workflows, the correct pattern when a built-in function isn't implemented is to call an endpoint, for example a Cloud Function or a Cloud Run service, which performs the transformation for you and returns the expected result.
It's a bit tedious to do, but don't hesitate to open a feature request on the issue tracker; the product team is very reactive and loves user feedback!
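If you go the Cloud Function route, a minimal sketch of such an endpoint could look like the one below. The function name compare_timestamp and the JSON fields timestamp and reference are placeholders of my own, not anything prescribed by Workflows or the question.

# Minimal sketch of an HTTP Cloud Function that a workflow could call.
# The function name and the JSON fields ("timestamp", "reference") are illustrative.
from datetime import datetime

import functions_framework


@functions_framework.http
def compare_timestamp(request):
    """Return whether `timestamp` is after `reference`, both RFC 3339 strings."""
    body = request.get_json(silent=True) or {}
    # Replace the trailing "Z" so fromisoformat() accepts it on older runtimes.
    ts = datetime.fromisoformat(body["timestamp"].replace("Z", "+00:00"))
    ref = datetime.fromisoformat(body["reference"].replace("Z", "+00:00"))
    return {"is_after": ts > ref}

From the workflow you could call this with an http.post step and branch on the returned is_after field. Note also that because both values are RFC 3339 timestamps in the same UTC format, a plain string comparison inside the workflow's condition would order them correctly as well.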

The Workflows standard library now includes a text module with functions for searching (including regular expressions), splitting, substrings and case transformations.

Related

Google Natural Language Sentiment Analysis incorrect result

We have Google Natural AI integrated into our product for Sentiment Analysis (https://cloud.google.com/natural-language). One of the customers complained that when they write "BAD" then it shows a positive sentiment.
On further investigation, we found that when the Google Natural Language Sentiment Analysis API is called with the input "BAD" or "Bad" (note the all-caps or first-letter-capitalized forms), it identifies the text as an entity (a location or consumer good) and returns a positive result, while when we write "bad" in all lowercase, it returns a negative one.
Has anyone faced a similar problem? How did you solve it?
One obvious workaround looks like converting the text to lowercase, but that may break other use cases (perhaps ones where entities do not get analyzed because of lowercase text). Another approach we are building is to use our own dictionary of words with sentiments before calling the Google APIs, but that doesn't solve the underlying problem, which may occur with any other text.
Inputs will help us. Thank you!
The NLP API uses an underlying model that is neural in nature. The knowledge comes from training on real world text. It is normal to get different results for different capitalizations as they can relate to different uses of the same trigram, e.g. Mike (person), mike (microphone, slang), MIKE (military alphabet entry).
The second key aspect is that the model is tuned and meant to be used on larger pieces of text, not single words, so good results cannot be expected in this case.
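To illustrate the point about longer text, a small sketch with the google-cloud-language Python client is below; the exact scores depend on the current model, this only shows how to submit a bare word versus a fuller sentence for comparison.

# Compare sentiment for a bare word vs. a fuller sentence.
# Requires `pip install google-cloud-language` and application default credentials.
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()

for text in ["BAD", "The support experience was bad and I would not recommend it."]:
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT, language="en"
    )
    sentiment = client.analyze_sentiment(request={"document": document}).document_sentiment
    print(f"{text!r}: score={sentiment.score:.2f}, magnitude={sentiment.magnitude:.2f}")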

How to have Free text custom slot type on AWS lex

I need a slot type that accepts any type of input. The slot I'm trying to fill is meant to capture review feedback from my clients.
After looking at all the possible options, the only way to achieve this seems to be providing training data for a custom slot type, which is what all the search results suggest as a solution. That is a total nightmare in my case.
I have provided 130+ sample values but it didn't work; about 95% of inputs fail.
I also have some more slots that need free text.
Has anyone achieved free text input? Need help :(
You will definitely want to parse the input and validate it in a Lambda function by checking event.inputTranscript (see the sketch after the links below).
More details on how to do that are in either of my other answers here:
What is Amazon Lex inbuilt slot type for description or notes?
AWS Lex + Lambda - Intercepting all of next user response regardless of context - without defining sample utterances?
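A rough sketch of that approach for a Lex V1 code hook is below; the session attribute name feedbackText and the confirmation message are placeholders of mine, and Lex V2 uses a different event/response shape.

# Lex V1 code-hook Lambda that captures the user's raw utterance as free text.
def lambda_handler(event, context):
    # Whatever the user actually typed or said, regardless of slot resolution.
    raw_text = event["inputTranscript"]

    session_attributes = event.get("sessionAttributes") or {}
    # Keep the free text around, e.g. to persist it during fulfillment.
    session_attributes["feedbackText"] = raw_text

    return {
        "sessionAttributes": session_attributes,
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {
                "contentType": "PlainText",
                "content": "Thanks for your feedback!",
            },
        },
    }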
The closest you would find to this functionality is a slot you can use that would allow you to capture free text - AMAZON.SearchQuery.
You should go through the documentation for this slot type though as it comes with caveats for how it can be used, and test it before going ahead.
Other than this take a look at the 5 Techniques to Replace AMAZON.LITERAL and Improve Skill Accuracy blog post which covers techniques that can be used.
Check this article; it has Python code that generates slot type values:
https://aws.amazon.com/blogs/machine-learning/create-a-translator-chatbot-using-amazon-translate-and-amazon-lex/
Now I have 1000 values, but I am facing a different issue:
Slot type works different for slots AWS LEX
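For reference, once a value list has been generated, loading it into a custom slot type can be scripted against the Lex (V1) model-building API with boto3; a rough sketch, with the slot type name and the values as placeholders:

# Bulk-load generated values into a Lex (V1) custom slot type.
import boto3

lex_models = boto3.client("lex-models")

values = [f"sample value {i}" for i in range(1000)]  # generated values go here

lex_models.put_slot_type(
    name="FreeTextFeedback",
    description="Generated slot type values",
    enumerationValues=[{"value": v} for v in values],
    valueSelectionStrategy="ORIGINAL_VALUE",
)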

OpenLDAP regex search with shell script

For an OpenLDAP database I need to find all users that have a telephone number matching a regex pattern, and are in a given Organizational Unit.
According to this: LDAP search using regular expression, it is impossible with an ldapsearch (which would otherwise have been my first choice).
I would like to do the least possible work client-side, and querying all users from an organizational unit and filtering them with grep or something similar seems too resource-consuming. Is there a better way to do it?
Also, I'm not very familiar with shell, so I'm a little afraid of sed, but I hear it's powerful and performs well for regex filtering. If I need to do the filtering client-side, which would be the easiest way (without compromising performance)?
And about batched inputs: if I get a lot of partial phone numbers in a CSV file, and each partial number could have the type "prefix"/"postfix"/"regex" (so there are two columns: type and partial number), what would be best performance-wise?
Should I just get all the users in the organizational unit and filter them with the shell script (iterating through all the users and trying to match any of the numbers)?
Or should I make a query for every number (this is only a viable option if a regex filter on attributes is possible in an LDAP query)?
At my level of knowledge the first one is the way to go, but is there a better solution?
I'm using OpenLDAP 2.4.23, if that matters in any way.
The results of using regular expressions with LDAP data might not be what you expect. LDAP data is not strings, but specific types of data defined by the schema, and an application must always retrieve the schema to learn how to deal with attribute values. The telephoneNumber attribute has a specific syntax, and regular expressions may not work. In general, matching rules must be used by LDAP clients to compare and match data stored in a directory server. In fact, best practice is that applications should always use matching rules, not native-language comparison operators or regular expressions. For more information, please see LDAP: Programming Practices and LDAP: Using Matching Rules.
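Putting that together with the batching question, one hedged sketch using the ldap3 Python library is below: the server narrows the result set with standard substring filters for the prefix/postfix rows, and only the regex rows are fetched and filtered client-side. The host, bind credentials, base DN and example patterns are all placeholders.

# Hybrid approach: server-side substring filters where possible, client-side regex otherwise.
import re

from ldap3 import Connection, Server, SUBTREE

server = Server("ldap://ldap.example.com")
conn = Connection(server, user="cn=admin,dc=example,dc=com", password="secret", auto_bind=True)

base_dn = "ou=People,dc=example,dc=com"

# "prefix"/"postfix" rows from the CSV map onto LDAP substring filters...
conn.search(base_dn, "(telephoneNumber=+361*)", search_scope=SUBTREE,
            attributes=["uid", "telephoneNumber"])
prefix_matches = conn.entries

# ...while "regex" rows have to be filtered client-side after a broader query.
pattern = re.compile(r"\+36 ?1 ?\d{3} ?\d{4}$")
conn.search(base_dn, "(telephoneNumber=*)", search_scope=SUBTREE,
            attributes=["uid", "telephoneNumber"])
regex_matches = [e for e in conn.entries
                 if any(pattern.search(str(n)) for n in e.telephoneNumber)]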

Google Bot information?

Does anyone know any more details about google's web-crawler (aka GoogleBot)? I was curious about what it was written in (I've made a few crawlers myself and am about to make another) and if it parses images and such. I'm assuming it does somewhere along the line, b/c the images in images.google.com are all resized. It also wouldn't surprise me if it was all written in Python and if they used all their own libraries for most everything, including html/image/pdf parsing. Maybe they don't though. Maybe it's all written in C/C++. Thanks in advance-
you can find a bit about how googlebot works here:
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=158587
for example the "fetch as googlebot" tool lets you see a page as Googlebot sees it.
The crawler is very likely written in C or C++; at least BackRub's crawler was written in one of those.
Be aware that the crawler only takes a snapshot of the page, then stores it in a temporary database for later processing. The indexing and other attached algorithms will extract the data, for example the image references.
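As a toy illustration of that snapshot-then-extract split (not Google's actual pipeline, just the shape of it):

# Fetch pages into a local store first, extract data (here, image refs) in a later pass.
import re
import sqlite3

import requests

db = sqlite3.connect("snapshots.db")
db.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, html TEXT)")

def snapshot(url):
    """Fetch a page and store the raw HTML for later processing."""
    html = requests.get(url, timeout=10).text
    db.execute("INSERT OR REPLACE INTO pages VALUES (?, ?)", (url, html))
    db.commit()

def extract_image_refs(url):
    """A separate pass pulls image references out of the stored snapshot."""
    (html,) = db.execute("SELECT html FROM pages WHERE url = ?", (url,)).fetchone()
    return re.findall(r'<img[^>]+src="([^"]+)"', html)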
Officially allowed languages at Google, I think, are Python/C++/Java.
The bot likely uses all 3 for different tasks.

How do I programmatically sanitize ColdFusion cfquery parameters?

I have inherited a large legacy ColdFusion app. There are hundreds of <cfquery>some sql here #variable#</cfquery> statements that need to be parameterized along the lines of: <cfquery> some sql here <cfqueryparam value="#variable#"/> </cfquery>
How can I go about adding parameterization programmatically?
I have thought about writing some regular expression or sed/awk'y sort of solution, but it seems like somebody somewhere has tackled such a problem. Bonus points awarded for inferring the sql type automatically.
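A rough sketch of that regex idea (in Python rather than sed/awk, and only over .cfm files) could look like the following; it just reports candidate spots, crudely skips queries that already contain a cfqueryparam, and makes no attempt to infer SQL types:

# Report #variable# occurrences inside <cfquery> blocks that have no cfqueryparam.
import re
import sys
from pathlib import Path

CFQUERY = re.compile(r"<cfquery\b.*?>(.*?)</cfquery>", re.IGNORECASE | re.DOTALL)
VARIABLE = re.compile(r"#([\w.\[\]]+)#")

for path in Path(sys.argv[1] if len(sys.argv) > 1 else ".").rglob("*.cfm"):
    source = path.read_text(errors="ignore")
    for query in CFQUERY.finditer(source):
        body = query.group(1)
        if "<cfqueryparam" in body.lower():
            continue  # crude: skips any query that already uses at least one param
        for var in VARIABLE.finditer(body):
            line = source[: query.start(1) + var.start()].count("\n") + 1
            print(f"{path}:{line}: unparameterized variable #{var.group(1)}#")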
There's a queryparam scanner that will find them for you on RIAForge: http://qpscanner.riaforge.org/
There is a script referenced here: http://www.webapper.net/index.cfm/2008/7/22/ColdFusion-SQL-Injection that will do the majority of the heavy lifting for you. All you have to do is check the queries and make sure the syntax will parse properly.
There is no excuse for not using cfqueryparam: apart from being much more secure, it is a performance boost and the best way to handle quoted values in character-based column types.
Keep in mind that you may not be able to solve everything with <cfqueryparam>.
I've seen a number of examples where the order by field name is being passed in the query string, which is a slightly trickier problem to solve as you need to validate that in a more "manual" way.
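A generic illustration of that whitelist idea (in Python rather than CFML, purely to show the shape of the check; the column names are made up):

# Only ever interpolate ORDER BY values you have mapped yourself, never raw user input.
ALLOWED_SORT_COLUMNS = {"name": "customer_name", "date": "created_at", "total": "order_total"}

def build_order_by(requested: str) -> str:
    column = ALLOWED_SORT_COLUMNS.get(requested.lower())
    if column is None:
        raise ValueError(f"unsupported sort field: {requested!r}")
    return f"ORDER BY {column}"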
<cf_inputFilter
scopes = "FORM,COOKIE,URL"
chars = "<,>,!,&,|,%,=,(,),',{,}"
tags="script,embed,applet,object,HTML">
We used this to counteract a recent SQL injection attack. We added it to the Application.cfm file for our site.
I doubt that there is a solution that will fit your needs exactly. The only option I see is to write your own recursive search that builds a report for you or use one of the apps/scripts that people have listed above. Basically, you are going to have to edit each page or approve all of the automated changes.