Returning keywords from parsed webpage - if-statement

I'm using ImportHtml coupled with a FIND function to parse a webpage. I would like to return one of 3 keywords if it is found on the webpage.
This is what I'm using (in Google Spreadsheet):
=If(FIND("Limited",INDEX(ImportHtml("http://www.fakeurl.com";"table";2),1,4))>0,"LIMITED",0)
but I don't know how to scale it up to 3 keywords instead of just this single one, knowing that only one of those keywords can ever be found (never 2 or 3 of them at once).
Any ideas?

This should do it:
=IF(OR(ISNUMBER(FIND("Limited",INDEX(ImportHtml("http://www.fakeurl.com";"table";2),1,4))),ISNUMBER(FIND("keyword2",INDEX(ImportHtml("http://www.fakeurl.com";"table";2),1,4))),ISNUMBER(FIND("keyword3",INDEX(ImportHtml("http://www.fakeurl.com";"table";2),1,4)))),"LIMITED",0)
If it finds any of those three words, it puts "LIMITED" in the cell. FIND returns an error rather than 0 when the text is not found, so each call is wrapped in ISNUMBER.
If you want to display which keyword it found, use this:
=IF(ISNUMBER(FIND("Limited",INDEX(ImportHtml("http://www.fakeurl.com";"table";2),1,4))),"limited",IF(ISNUMBER(FIND("keyword2",INDEX(ImportHtml("http://www.fakeurl.com";"table";2),1,4))),"keyword2",IF(ISNUMBER(FIND("keyword3",INDEX(ImportHtml("http://www.fakeurl.com";"table";2),1,4))),"keyword3",0)))

Script to generate html Beyond Compare folder differences

I've found several ways to automate folder comparison using scripts in Beyond Compare, but none that produce the pretty HTML report created from Session > Folder Compare Report > View in browser.
Here is an example of what that looks like.
I would love to be able to find the script that gives me that html difference report.
Thanks!
This is what I am currently getting
load "C:\Users\UIDQ5763\Desktop\Enviornment.cpp" &
"C:\Users\UIDQ5763\Desktop\GreetingsConsoleApp"
folder-report layout:side-by-side options:display-all &
output-to:C:\Users\UIDQ5763\Report.html output-options:html-color
The documentation for Beyond Compare's scripting language is here. You were probably missing either layout:side-by-side, which gives the general display, or output-options:html-color, which is required to get correctly styled HTML output. You may want to change options:display-all to options:display-mismatches if you only want to see the differences, and you might want to add an expand all command immediately before the folder-report line if you want to see the subfolders recursively.
The & characters shown in the sample are line continuation characters. Remove them if you don't need to wrap your lines.

Django: How to get the truncated portion of “truncatewords” most directly?

I am new to Django. I'm trying to split a title into multiple lines depending on its length so I can style the subsequent lines appropriately. I'm using something like {{ book.title|truncatewords:8 }}
How can I get the 9th-16th words, or the 16th-24th words? (Something like the |slice filter, but for words.)
I'm trying to avoid playing around with the backend, so the solutions here are too involved: Is there a Django template filter that handles "...more" and when you click on it, it shows more of the text?
@user3279773 seemingly came up with a solution here involving |safe but didn't post it (and I don't have enough rep to comment a shoutout):
Django: How to get the truncated portion of "truncatewords"
I tried to figure out the |safe solution with various permutations of my code and referenced the link below, but alas, to no avail.
https://docs.djangoproject.com/en/dev/ref/templates/builtins/#std:templatefilter-safe

HTML labels with ocamlgraph

Is it possible to make a graph like this one with ocamlgraph? HTML labels have to be delimited with <> instead of "" and I don't see any mention of this functionality in the documentation.
OCamlgraph can parse this kind of dot node: the documentation for its Dot_ast module shows an Html of string case in the id type for this. It seems that it cannot print this kind of dot file, though, as the `Label case of the Dot attributes only handles plain strings.
If you need this feature, you could consider implementing it yourself (just change the files graphviz.ml and graphviz.mli); I'm sure the authors would be glad to have a contribution.

Cleansing string / input in Coldfusion 9

I have been working with ColdFusion 9 lately (background in PHP primarily) and I am scratching my head trying to figure out how to clean/sanitize user-submitted input strings.
I want to make them HTML-safe and eliminate any JavaScript or SQL injection, the usual.
I am hoping I've overlooked some kind of function that already comes with CF9.
Can someone point me in the proper direction?
Well, for SQL injection, you want to use CFQUERYPARAM.
As for sanitizing the input for XSS and the like, you can use the ScriptProtect attribute in CFAPPLICATION, though I've heard that doesn't work flawlessly. You could look at Portcullis or similar 3rd-party CFCs for better script protection if you prefer.
This is an addition to Kyle's suggestions, not an alternative answer, but the comments panel is a bit rubbish for links.
Take a look at the ColdFusion string functions. You've got HTMLCodeFormat, HTMLEditFormat, JSStringFormat and URLEncodedFormat, all of which can help you when working with content posted from a form.
You can also try to use the regex functions to remove HTML tags, but it's never an exact science. This ColdFusion-based regex/HTML question should help there a bit.
You can also try to protect yourself from bots and known spammers using something like cfformprotect, which integrates Project Honeypot and Akismet protection amongst other tools into your forms.
You've got several options:
"Global Script Protection" Administrator setting, which applies a regular expression against post and get (i.e. FORM and URL) variables to strip out <script/>, <img/> and several other tags
Use isValid() to validate variables' data types (see my in depth answer on this one).
<cfqueryparam/>, which serves to create SQL bind parameters and validate the datatype passed to it.
That noted, if you are really trying to sanitize HTML, use Java, which ColdFusion can access natively. In particular use the OWASP AntiSamy Project, which takes an HTML fragment and whitelists what values can be part of it. This is the same approach that sites like SO and slashdot.org use to protect submissions and is a more secure approach to accepting markup content.
Sanitizing strings is very important in ColdFusion, as in almost any language, and how you do it depends on what you want to do with the string. Most mitigations are for:
saving content to database (e.g. <cfqueryparam ...>)
using content to show on next page (e.g. put url-parameter in link or show url-parameter in text)
saving files and using upload filenames and content
There is always a risk in the approach of allowing basically everything in the first step and then sanitizing malicious code "away" by deleting or replacing characters (blacklist approach).
Whenever possible, the better and easier solution is to use rereplace(...) with regular expressions that explicitly allow only the characters needed for your scenario. Use cases are inputs for numbers, lists, email addresses, URLs, names, zip codes, cities, etc.
For example, if you want to ask for an email address, you could use
<cfif reFindNoCase("^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(?:[A-Z]{2,})$", stringtosanitize)>...ok, clean...<cfelse>...not ok...</cfif>
(or a regex of your own).
For HTML or CSS input, I would also recommend the OWASP Java HTML Sanitizer Project.

HTML templating in C++ and translations

I'm using HTML_Template for templating in my C++-based web app (don't ask). I chose that because it was very simple and it turns out to be a good solution.
The only problem right now is that I would like to be able to include translatable strings in the HTML templates (HTML_Template does not really support that).
Ultimately, what I would like is to have a single file that contains all the strings to be translated. It can then be given to a translator and plugged back in to the app and used depending on which language the user chose in settings.
I've been going back and forth on some options and was wondering what others felt was the best choice (or if there's a better choice that isn't listed)
Extend HTML_Template to include a tag for holding the literal string to translate. So, for example, in the HTML I would put something like
<TMPL_TRANS "this is the text to translate"/>
Use a completely separate scheme for translation and preprocess the HTML files to generate the final template files (without the special translation lingo). For example, in the pre-processed file, translatable text would look like this:
{{this is the text to translate}}
and the final would look like:
this is the text to translate
Don't do anything and let the translators find the string to translate in the html and js files themselves.
You may want to consider arrays, if you haven't already.
A popular implementation for translating strings is to use tables and indices. One index is for the language and the second index is for the string. Create a function that returns strings based on these two indices:
const std::string& Get_String(unsigned int language_index, unsigned int string_index);
Each language would have a table of strings (or const char *). There would be a table of pointers to language tables, one for each supported language.
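A minimal sketch of that table layout, assuming two languages and two strings. The enum names, the table names and the sample strings are made up for illustration; only the Get_String signature comes from the description above. Bounds checking via std::array::at is one possible choice, not the only one.

```cpp
#include <array>
#include <stdexcept>
#include <string>

// Hypothetical language and string identifiers (not from the original post).
// Their values double as indices into the tables below.
enum Language : unsigned int { LANG_ENGLISH, LANG_FRENCH, LANG_COUNT };
enum StringId : unsigned int { STR_GREETING, STR_FAREWELL, STR_COUNT };

using StringTable = std::array<std::string, STR_COUNT>;

// One table of strings per language, in StringId order.
static const StringTable english_strings = {"Hello", "Goodbye"};
static const StringTable french_strings = {"Bonjour", "Au revoir"};

// Table of pointers to the language tables, in Language order.
static const std::array<const StringTable*, LANG_COUNT> language_tables = {
    &english_strings, &french_strings};

// Returns the string for the given language and string index.
// std::array::at throws std::out_of_range if either index is invalid.
const std::string& Get_String(unsigned int language_index,
                              unsigned int string_index) {
    return language_tables.at(language_index)->at(string_index);
}
```

With these sample tables, Get_String(LANG_FRENCH, STR_GREETING) returns "Bonjour". In a real app the per-language tables would more likely be loaded from the single translation file mentioned in the question than hard-coded.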
The biggest pain is to convert existing code to use this system.
Hope this helps.