Is there a way to exclude certain keywords from a regexp_match statement in Tableau?

I am trying to edit a calculated field so that it pulls in filenames that contain the string 'NDA'. However, filenames that contain 'STANDARD' also get pulled in by mistake. Is there a way to exclude them in Tableau? I have tried the following, but it is too restrictive: the majority of the files I'd expect to match no longer get pulled in.
IF REGEXP_MATCH(UPPER([Name]),'_NDA|NDA_|_NDA_|NDA<>STANDARD')THEN "Nondisclosure Agreement"

You can try creating it as a separate IF statement:
IF REGEXP_MATCH(UPPER([Name]),'STANDARD') THEN "Whatever you want here"
ELSEIF REGEXP_MATCH(UPPER([Name]),'_NDA|NDA_|_NDA_') THEN "Nondisclosure Agreement"
END
On an unrelated note, you should think about using CONTAINS instead of REGEXP_MATCH, since it's usually better from a performance point of view.
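The ordering in that calculation (test the excluded keyword first, then the wanted pattern) can be sketched outside Tableau as well. The following is only an illustration in Python, not Tableau syntax; the function name `classify` and the "Excluded" label are assumptions made for the example.

```python
import re

# Same alternatives as in the calculated field (the third one is redundant but harmless).
NDA_PATTERN = re.compile(r"_NDA|NDA_|_NDA_")

def classify(name):
    """Return a label for a filename; the exclusion check runs before the NDA check."""
    upper = name.upper()
    if "STANDARD" in upper:
        # Hypothetical label standing in for "Whatever you want here".
        return "Excluded"
    if NDA_PATTERN.search(upper):
        return "Nondisclosure Agreement"
    return None

print(classify("Acme_NDA_2020.pdf"))
print(classify("standard_nda_terms.pdf"))
```

Because the exclusion is checked first, a name containing both 'STANDARD' and '_NDA' ends up in the excluded branch.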

l10n/i18n: how to handle phrases with dynamic list of items?

What's the sanest way to handle translation and localization of dynamic lists?
Let's say I've queried the database, and got a list ["Foos", "Bars", "Bazes"]. Let's also assume the list always contains at least two items - I'll be sure to use a different translation for the single-item case.
What should I do if I need a phrase like "We have a wide choice of Foos, Bars and Bazes in our code"? (assuming that list items are dynamic so I can't just pre-translate all the possible permutations, and need to do things at runtime.)
I see at least the following issues:
I need to inflect all the items to the correct form (are there languages where different forms are required depending on the position in the list?)
Different locales may have drastically different rules for how to join items.
E.g. CJK locales need "、" instead of ",".
And AFAIK in Chinese there will be "及" or "和" - depending on the full phrase - before the last item, so I guess there's some ambiguity with translating "and".
And, as I've read, some languages may avoid punctuation as it's used in English and have other conventions instead, e.g. an Arabic translator may prefer to use "و" before every item (although they also have commas, "،"). Not sure if true or not - I don't know Arabic, just saw it mentioned.
My problem is, I don't even know what tooling may help me here. I don't have any particular programming language requirements, although Python or JavaScript would be the best. But I guess I can run just about anything, as I can probably build an l10n microservice and query it from my project.
I had used GNU gettext before I encountered this, but I haven't found anything in its APIs and data formats that would help. The best I can imagine is _("We have a wide choice of %s in our code", list_text), generating list_text with some DIY hacks. I'm not sure the XLIFF format has anything like this either. I've found i18n-list-generator on npm but it's way too simplistic.
Has anyone dealt with something like this? What did you do? Is there any library out there that handles this - so I can take a look at its API and learn how it does things?
Here's how I would approach it:
No concatenation. All string joining needs to be done via format strings with placeholders.
Only use format strings that support named/numbered placeholders. E.g. {FOO} or $1 instead of %s (this is to allow for parameter reordering). Named placeholders are also better since they give more context to translators. Let's assume we're using {FOO}-style placeholders.
To render a list, I would use a couple of format strings, e.g.: joinItem = "{LIST}, {ITEM}" to append items to the list and joinLastItem = "{LIST} and {ITEM}" to append the last item. This will allow one to render strings like Foos, Bars and Bazes, change punctuation and even reverse the ordering of the list, if necessary.
Finally, you can use the final format string, e.g. weHaveTheseItems = "We have a wide choice of {ITEMS} in our code", assuming the {ITEMS} gets replaced with the previously rendered string.
Shameless self-promotion: you may want to have a look at the Plurr library that supports such {FOO}-style placeholders, as well as plurals (something you will likely need for such messages). It supports JavaScript among other languages.
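A minimal sketch of the format-string approach described in this answer, written in Python with `str.format` standing in for whatever placeholder library is actually used. The function name `render_list` is an assumption for the example, and the list is assumed to hold at least two items, as the question states.

```python
def render_list(items, join_item, join_last_item):
    """Fold a list into one string using translator-supplied format strings."""
    rendered = items[0]
    # Every item except the first and the last is appended with join_item.
    for item in items[1:-1]:
        rendered = join_item.format(LIST=rendered, ITEM=item)
    # The last item gets its own format string (e.g. "and" in English).
    return join_last_item.format(LIST=rendered, ITEM=items[-1])

# These three strings are what a translator would localize.
join_item = "{LIST}, {ITEM}"
join_last_item = "{LIST} and {ITEM}"
we_have_these_items = "We have a wide choice of {ITEMS} in our code"

items_text = render_list(["Foos", "Bars", "Bazes"], join_item, join_last_item)
print(we_have_these_items.format(ITEMS=items_text))
```

Because the separators live in the format strings rather than in code, a locale could supply, for instance, "{LIST}、{ITEM}" for joinItem without any code change.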
This is a pain: as you point out, not all locales can be expected to support the ",,,,and" form.
Inspired by @GSerg and @Igor Afanasyev I came up with a GNU Gettext based solution like the following (pseudo gettext invocation):
GettextPlural(
// TRANSLATORS: For multiple "choices", each will be prefixed with a new-line (\n)
"We have a wide choice of {choices} in our code",
"In our code we have a wide choice of{choices}", choices.Count)
should print like:
"We have a wide choice of FOOs in our code"
"In our code we have a wide choice of
FOOs
BARs
BAZs"
Remember to stick the --add-comments=TRANSLATORS to your xgettext invocation.
For Web purposes you could use <ul><li>...</li>... </ul> or whatever instead of \n.
The benefit is that layout is at least as universal as UI layout, but you are still allowing non-English'ish locale plural forms.
Some languages have only one plural form so their translation must work with both a single choice and multiple choices, so in particular, they cannot have a conditional new-line.
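The pseudo invocation above could look roughly like this with Python's standard `gettext` module. This is only a sketch: `gettext.NullTranslations` stands in for a real translation catalog (it simply applies the English singular/plural rule), and the function name `describe_choices` is an assumption for the example.

```python
import gettext

# Stand-in for a real catalog; returns the first string when n == 1, else the second.
translations = gettext.NullTranslations()

def describe_choices(choices):
    """Pick the singular or plural template, then fill in the choices."""
    # TRANSLATORS: For multiple "choices", each will be prefixed with a new-line (\n)
    template = translations.ngettext(
        "We have a wide choice of {choices} in our code",
        "In our code we have a wide choice of{choices}",
        len(choices),
    )
    if len(choices) == 1:
        rendered = choices[0]
    else:
        # Each item goes on its own line, so no locale-specific joining is needed.
        rendered = "".join("\n" + choice for choice in choices)
    return template.format(choices=rendered)

print(describe_choices(["FOOs"]))
print(describe_choices(["FOOs", "BARs", "BAZs"]))
```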

Nested tables in livecycle fall apart on email

I have a form here with a nested table - where each table can dynamically grow, i.e., the inner table (w/ Transit No and Account No) and the outer table (Accounts by ID No). Here is an example:
Behind the buttons:
Add - $.parent.tbl.Row.instanceManager.addInstance();
Remove - $.parent.instanceManager.removeInstance(this.parent.index); (In production I make sure there is at least one row to remove...)
In the definition for each table I have not checked 'Repeat Table for Each Data Item'. This works great. However, I did try with it checked and the outcome was the same.
Now, when I email the form and open the attachment, this is what I see:
You can see that the second table didn't make it, and apparently a row was added to the inner table in the first, without any data.
Any ideas on what's going wrong here? And what I can do about it?
Unfortunately I'm not sure what's wrong with your form but I have made a similar form that works - so I can show you how I did it and list a few things that I can think of that can cause problems.
This is what my form looks like and when I e-mail it, it comes out exactly the way it is:
(It has repeatable parent- and childsubforms like yours)
I did it entirely with JS though, no FormCalc and Dollar $igns :D
When a button is pressed I call a function from a Scriptobject.
These are the main parts of my script inside my functions:
Adding a Subform:
var oNewInstance = subform.instanceManager.addInstance(1);
Deleting a Subform:
if (subform.instanceManager.count > subform.instanceManager.occur.min)
{
subform.instanceManager.removeInstance(subform.index);
}
And these are my subforms' properties (in German, but you can figure it out :P):
Your problem might have a completely different cause, though: make sure nothing changes the form in an initialize, docReady, preSubmit or similar event that fires between sending and opening the sent PDF.
Also before sending it as an e-mail you have to save it in Acrobat as a Reader Extended PDF:
Besides that I've noticed that sometimes problems can occur due to the target version (Selectable in LCD under File > Form Properties > Defaults).
It helped me sometimes to set it to the newest one.

Match all characters in group except for first and last occurrence

Say I request
parent/child/child/page-name
in my browser. I want to extract the parent, children as well as page name. Here are the regular expressions I am currently using. There should be no limit as to how many children there are in the url request. For the time being, the page name will always be at the end and never be omitted.
^([\w-]{1,}){1} -> Match parent (returns 'parent')
(/(?:(?!/).)*[a-z]){1,}/ -> Match children (returns /child/child/)
[\w-]{1,}(?!.*[\w-]{1,}) -> Match page name (returns 'page-name')
The more I play with this, the more I feel how clunky this solution is. This is for a small CMS I am developing in ASP Classic (:(). It is sort of like MVC routing paths, but instead of calling controllers and functions based on the URL request, I would be travelling down the hierarchy and finding the appropriate page in the database. The database uses the nested set model and is linked by a unique page name for each child.
I have tried using the Split function with a / delimiter; however, I found I was nesting so many Split statements together that it became very unreadable.
All said, I need an efficient way to parse out the parent, children as well as page name from a string. Could someone please provide an alternative solution?
To be honest, I'm not even sure if a regular expression is the best solution to my problem.
Thank you.
You could try using:
^([\w-]+)(/.*/)([\w-]+)$
And then access the three matching groups created using Match.SubMatches. See here for more details.
EDIT
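Actually, assuming that you know that [\w-] is all that is used in the names of the parts, you can use ^([\w-]+)(.*/)([\w-]+)$ instead and it will handle the no-child case fine by itself as well. (Keep the trailing / inside the middle group: with a bare (.*) the greedy match would leave only the last character of the page name for the third group.)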
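To check the three-group idea quickly, here is a sketch in Python rather than VBScript. The pattern is a variant of the ones above that keeps the trailing slash in the middle group, so the last group captures the whole page name and "parent/page-name" (no children) also matches. The function name `parse_path` is an assumption for the example.

```python
import re

# Group 1: parent, group 2: everything up to and including the last slash,
# group 3: page name.
PATH_PATTERN = re.compile(r"^([\w-]+)(.*/)([\w-]+)$")

def parse_path(path):
    """Split a request path into (parent, list of children, page name)."""
    match = PATH_PATTERN.match(path)
    if match is None:
        return None
    parent, middle, page = match.groups()
    # The middle group looks like "/child/child/"; drop the empty pieces.
    children = [part for part in middle.split("/") if part]
    return parent, children, page

print(parse_path("parent/child/child/page-name"))
print(parse_path("parent/page-name"))
```

Splitting only the middle group keeps the split logic to a single call, which avoids the nested splits mentioned in the question.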

SOAP - Why do I need to query for the original values for an update?

I'm taking over a project and want to understand whether this is common practice with SOAP. In the process currently in place, I have to query all the values before I do an update, because I need to pass back all the values that are not being updated. Does this sound right?
Example Values:
fname=phill
lname=pafford
address=123 main
phone=222-555-1212
So if I just wanted to update the phone number I need to query for the record, get all the values and submit these values for an update.
Example Update Values:
fname=phill
lname=pafford
address=123 main
phone=111-555-1212
I just want to know if this is common practice or should I change the functionality of this?
This is not specific to SOAP. It may simply be how the service is designed. In general, there will be fields that can only be updated if you have the original value: you can't add one to a field unless you know the original value, for instance. The service seems to have been designed for the general case.
I don't think that it is a very "common" practice. However, I've seen cases where the old values are posted together with the new values, in order to validate that no one else has updated the values in the meantime.
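The "post the old values together with the new ones" check mentioned above can be sketched as follows. This is a generic illustration in Python with an in-memory dictionary as the data store, not the service from the question; `update_record` and `StaleRecordError` are hypothetical names.

```python
class StaleRecordError(Exception):
    """Raised when the stored record no longer matches the caller's original values."""

def update_record(store, record_id, original, changes):
    """Apply changes only if nobody else modified the record in the meantime."""
    current = store[record_id]
    if current != original:
        # Someone else updated the record after the caller read it.
        raise StaleRecordError(record_id)
    store[record_id] = {**current, **changes}
    return store[record_id]

store = {1: {"fname": "phill", "lname": "pafford",
             "address": "123 main", "phone": "222-555-1212"}}
original = dict(store[1])  # values the client queried before editing
print(update_record(store, 1, original, {"phone": "111-555-1212"}))
```

A second update that still sends the old phone number as its original value would raise `StaleRecordError` instead of silently overwriting the newer data.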

How do you allow the usage of an <img> while preventing XSS?

I'm using ASP.NET Web Forms for blog style comments.
Edit 1: This looks way more complicated than I first thought. How do you filter the src?
I would prefer to still use real html tags but if things get too complicated that way, I might go a custom route. I haven't done any XML yet, so do I need to learn more about that?
If IMG is the only thing you'd allow, I'd suggest you use a simple square-bracket syntax to allow it. This would eliminate the need for a parser and reduce a load of other dangerous edge cases with the parser as well. Say, something like:
Look at this! [http://a.b.c/m.jpg]
Which would get converted to
Look at this! <img src="http://a.b.c/m.jpg" />
You should also filter the SRC address so that nothing malicious gets passed in through the SRC part, such as:
Look at this! [javascript:alert('pwned!')]
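A rough sketch of the bracket syntax with SRC filtering, written in Python for brevity rather than ASP.NET. The allow-lists (http/https only, a few image extensions) and the names `render_comment` and `is_safe_image_url` are assumptions for the example; a real site would pick its own rules.

```python
import html
import re
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}                    # assumed allow-list
ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")  # assumed allow-list
BRACKET = re.compile(r"\[([^\[\]\s]+)\]")

def is_safe_image_url(url):
    """Accept only absolute http(s) URLs whose path ends in an image extension."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return (parts.scheme in ALLOWED_SCHEMES
            and bool(parts.netloc)
            and parts.path.lower().endswith(ALLOWED_EXTENSIONS))

def render_comment(text):
    """HTML-encode the whole comment, turning safe [url] spans into <img> tags."""
    out = []
    pos = 0
    for match in BRACKET.finditer(text):
        out.append(html.escape(text[pos:match.start()]))
        url = match.group(1)
        if is_safe_image_url(url):
            out.append('<img src="{}" />'.format(html.escape(url, quote=True)))
        else:
            # Anything that fails the check stays as encoded plain text.
            out.append(html.escape(match.group(0)))
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)

print(render_comment("Look at this! [http://a.b.c/m.jpg]"))
print(render_comment("Look at this! [javascript:alert('pwned!')]"))
```

Everything outside an accepted bracketed URL is encoded, so real HTML tags typed into a comment never reach the page as markup.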
Use an XML parser to validate your input, and drop or encode all elements, and attributes, that you do not want to allow. In this case, delete or encode all tags except the <img> tag, and all attributes from that except src, alt and title.
If you end up going with a non-HTML format (which makes things easier b/c you can literally escape all HTML), use a standard syntax like markdown. The markdown image syntax is ![alt text](/path/to/image.jpg)
There are others also, like Textile. Its syntax for images is !imageurl!
@chakrit suggested using a custom syntax, e.g. bracketed URLs - This might very well be the best solution. You DEFINITELY don't want to start messing with parsing etc.
Just make sure you properly encode the entire comment (according to the context - see my answer on this here Will HTML Encoding prevent all kinds of XSS attacks?)
(btw I just discovered a good example of custom syntax right there... ;-) )
As also mentioned, restrict the file extension to jpg/gif/etc - even though this can be bypassed, and also restrict the protocol (e.g. http://).
Another issue to consider besides XSS is CSRF (http://www.owasp.org/index.php/Cross-Site_Request_Forgery). If you're not familiar with this security issue, it basically allows the attacker to force my browser to submit a valid authenticated request to your application, for instance to transfer money or to change my password. If this is hosted on your site, he can anonymously attack any vulnerable application - including yours. (Note that even if other applications are vulnerable, it's not your fault they get attacked, but you still don't want to be the exploit host or the source of the attack...) As far as your own site goes, it's that much easier for the attacker to change the user's password on your site, for instance.