Flash AS3 count capital letters? - regex

How could I count the number of capital letters in a string using flash as3?
for example
var thestring = "This is The String";
should return int 3
Thank you

// Starting string.
var thestring:String = "This is The String";
// Match all capital letters and check the length of the returned match array.
var caps:int = thestring.match(/[A-Z]/g).length;
trace(caps); // 3

One way to solve this is to convert the string to lower case and count the characters affected. That means you don't have to specify which characters to include in the category of "uppercase letters", which isn't trivial. This method supports accented characters such as É.
// Starting string.
var theString:String = "'Ö' is actually the Swedish word for 'island'";
var lowerCase : String = theString.toLowerCase();
var upperCount : int = 0;
for (var i:int = 0; i < theString.length; i++) {
if (theString.charAt(i) != lowerCase.charAt(i)) {
upperCount++;
}
}
trace(upperCount); // prints 2
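To make the difference between the two approaches concrete, here is a minimal sketch in JavaScript rather than AS3. The function names countAsciiCaps and countByLowercasing are made up for illustration. The second function compares each character with its own lower-case form, so it does not rely on the lower-cased string having the same length as the original.

```javascript
// Counts only the ASCII capitals A-Z, like the regex answer.
function countAsciiCaps(str) {
  const matches = str.match(/[A-Z]/g); // null when nothing matches
  return matches === null ? 0 : matches.length;
}

// Counts every character that changes when it is lower-cased.
function countByLowercasing(str) {
  let count = 0;
  for (const ch of str) {
    if (ch !== ch.toLowerCase()) {
      count++;
    }
  }
  return count;
}
```

On "This is The String" both functions return 3. On the Swedish example the first returns 1 (it only sees the S) and the second returns 2.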

Each letter in a string has a value that corresponds with that letter:
var myString:String = "azAZ";
trace(myString.charCodeAt(0));
trace(myString.charCodeAt(1));
trace(myString.charCodeAt(2));
trace(myString.charCodeAt(3));
// Output is 97, 122, 65, 90
myString.charCodeAt(x) returns the character code of the letter at index x in the string, counting from 0.
From this output we know that a - z have codes ranging from 97 to 122, and that A - Z have codes ranging from 65 to 90.
With that, we can now write a For Loop to find the capital letters:
var myString:String = "This is The String";
var tally:int = 0;
for (var i:int = 0; i < myString.length; i++)
{
if (myString.charCodeAt(i) >= 65 && myString.charCodeAt(i) <= 90)
{
tally += 1;
}
}
trace(tally);
// Output is 3.
The variable "tally" keeps track of the number of capital letters found. Inside the For Loop, we check whether the code of the current letter is between 65 and 90. If it is, we add 1 to tally. The total is traced once the For Loop finishes.

Why be succinct? I say, processing power is made to be used. So:
const VALUE_0:uint = 0;
const VALUE_1:uint = 1;
var ltrs:String = "This is JUST some random TexT. How many Caps?";
var cnt:int = 0;
for(var i:int = 0; i < ltrs.length; i++){
cnt += processLetter(ltrs.substr(i,1));
}
trace("Total capital letters: " + cnt);
function processLetter(char:String):int{
var asc:int = char.charCodeAt(0);
if(asc >= Keyboard.A && asc <= Keyboard.Z){
return VALUE_1;
}
return VALUE_0;
}
// Heh heh!

How to split 1 long paragraph to 2 shorter paragraphs? Google Document

I want paragraphs to be up to 3 sentences only.
For that, my strategy is to loop on all paragraphs and find the 3rd sentence ending (see note). And then, to add a "\r" char after it.
This is the code I have:
for (var i = 1; i < paragraphs.length; i++) {
...
sentEnds = paragraphs[i].getText().match(/[a-zA-Z0-9_\u0590-\u05fe][.?!](\s|$)|[.?!][.?!](\s|$)/g);
//this array is used to count sentences in Hebrew/English/digits that end with 1 or more of either ".","?" or "!"
...
if ((sentEnds != null) && (sentEnds.length > 3)) {
lineBreakAnchor = paragraphs[i].getText().match(/.{10}[.?!](\s)/g);
paragraphs[i].replaceText(lineBreakAnchor[2],lineBreakAnchor[2] + "\r");
}
}
This works fine for round 1. But if I run the code again, the text after the inserted "\r" char is not recognized as a new paragraph. Hence, more "\r" characters (new lines) are inserted each time the script runs.
How can I make the script "understand" that "\r" means new, separate paragraph?
OR
Is there another character/approach that will do the trick?
Thank you.
Note: I use the last 10 characters of the sentence assuming the match will be unique enough to make only 1 replacement.
You can achieve this without modifying your own regex.
Try this approach to split the paragraphs:
Grab the whole content of the document and create an array of sentences.
Insert paragraphs with up to 3 sentences after original paragraphs.
Remove original paragraphs from hell.
function sentenceMe() {
var doc = DocumentApp.getActiveDocument();
var paragraphs = doc.getBody().getParagraphs();
var sentences = [];
// Split paragraphs into sentences
for (var i = 0; i < paragraphs.length; i++) {
var parText = paragraphs[i].getText();
//Count sentences in Hebrew/English/digits that end with 1 or more of either ".","?" or "!"
var sentEnds = parText.match(/[a-zA-Z0-9_\u0590-\u05fe][.?!](\s|$)|[.?!][.?!](\s|$)/g);
if (sentEnds){
for (var j=0; j< sentEnds.length; j++){
var initIdx = 0;
var sentence = parText.substring(initIdx,parText.indexOf(sentEnds[j])+3);
var parInitIdx = initIdx;
initIdx = parText.indexOf(sentEnds[j])+3;
parText = parText.substring(initIdx - parInitIdx);
sentences.push(sentence);
}
}
// console.log(sentences);
}
inThrees(doc, paragraphs, sentences)
}
function inThrees(doc, paragraphs, sentences) {
// define offset
var offset = paragraphs.length;
// Create paragraphs with up to 3 sentences
var k=0;
do {
var parText = sentences.splice(0,3).join(' ');
doc.getBody().insertParagraph(k + offset , parText.concat('\n'));
k++
}
while (sentences.length > 0)
// Remove paragraphs from hell
for (var i = 0; i < offset; i++){
doc.getBody().removeChild(paragraphs[i]);
}
}
In case you are wondering about the custom menu, here it is:
function onOpen() {
var ui = DocumentApp.getUi();
ui.createMenu('Custom Menu')
.addItem("3's the magic number", 'sentenceMe')
.addToUi();
}
References:
DocumentApp.Body.insertParagraph
Actually the detection of sentences is not an easy task.
A sentence does not always end with a dot, a question mark or an exclamation mark. If the sentence ends with a quote then punctuation rules in some countries force you to put the end of the sentence mark inside the quote:
John asked: "Who's there?"
Not every dot marks the end of a sentence. A dot after an uppercase letter usually does not, because it follows an initial. The sentence does not end after J. here:
The latest Star Wars movie has been directed by J.J. Abrams.
However, sometimes the sentence does end after a capital letter followed by a dot:
This project has been sponsored by NASA.
And abbreviations can make it very hard:
For more information check the article in Phys. Rev. Letters 66, 2697, 2013.
Having in mind these difficulties let's still try to get some expression which will work in "usual" cases.
Make a global match and substitution. Match
((?:[^.?!]+[.?!] +){3})
and substitute it with
\1\r
This looks for 3 sentences (a sentence is a sequence of not-dot, not-?, not-! characters followed by a dot, a ? or a ! and some spaces) and puts a \r after them.
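As a rough illustration, the same match and substitution can be tried in plain JavaScript outside Google Docs. The function name breakEveryThreeSentences is hypothetical, and in JavaScript the backreference in the replacement string is written $1 rather than \1.

```javascript
// Inserts "\r" after every complete run of three sentences.
// A sentence here is: one or more characters that are not . ? !,
// then one of . ? !, then one or more spaces.
function breakEveryThreeSentences(text) {
  return text.replace(/((?:[^.?!]+[.?!] +){3})/g, '$1\r');
}
```

For "A. B. C. D. E. F. G." this yields "A. B. C. \rD. E. F. \rG.". The trailing sentence is left alone because it is not followed by a space and does not complete a group of three.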
UPDATED 2020-03-04
Try this:
var regex = new RegExp('((?:[a-zA-Z0-9_\\u0590-\\u05fe\\s]+[.?!]+\\s+){3})', 'gi');
for (var i = 1; i < paragraphs.length; i++) {
paragraphs[i].replaceText(regex, '$1\\r');
}

Backspace String Compare Leetcode Question

I have a question about the following problem on Leetcode:
Given two strings S and T, return if they are equal when both are typed into empty text editors. # means a backspace character.
Example 1:
Input: S = "ab#c", T = "ad#c"
Output: true
Explanation: Both S and T become "ac".
Example 2:
Input: S = "ab##", T = "c#d#"
Output: true
Explanation: Both S and T become "".
Example 3:
Input: S = "a##c", T = "#a#c"
Output: true
Explanation: Both S and T become "c".
Example 4:
Input: S = "a#c", T = "b"
Output: false
Explanation: S becomes "c" while T becomes "b".
Note:
1 <= S.length <= 200
1 <= T.length <= 200
S and T only contain lowercase letters and '#' characters.
Follow up:
Can you solve it in O(N) time and O(1) space?
My answer:
def backspace_compare(s, t)
if (s.match?(/[^#[a-z]]/) || t.match?(/[^#[a-z]]/)) || (s.length > 200 || t.length > 200)
return "fail"
else
rubular = /^[\#]+|([^\#](\g<1>)*[\#]+)/
if s.match?(/#/) && t.match?(/#/)
s.gsub(rubular, '') == t.gsub(rubular, '')
else
new_s = s.match?(/#/) ? s.gsub(rubular, '') : s
new_t = t.match?(/#/) ? t.gsub(rubular, '') : t
new_s == new_t
end
end
end
It works in the terminal and passes the given examples, but when I submit it on leetcode it tells me Time Limit Exceeded. I tried shortening it to:
rubular = /^[\#]+|([^\#](\g<1>)*[\#]+)/
new_s = s.match?(/#/) ? s.gsub(rubular, '') : s
new_t = t.match?(/#/) ? t.gsub(rubular, '') : t
new_s == new_t
But I get the same error.
So far, I believe my code fulfills the O(n) time, because there are only two ternary operators, which overall is O(n). I'm making 3 assignments and one comparison, so I believe that fulfills the O(1) space complexity.
I have no clue how to proceed beyond this, been working on it for a good 2 hours..
Please point out if there are any mistakes in my code, and how I am able to fix it.
Thank you! :)
Keep in mind that with N <= 200, your problem is more likely the constant factor than the algorithmic complexity. Using O(N) space is immaterial here: with only 400 characters in total, space is not an issue. You have six regex matches, two of which are redundant. More importantly, regex processing is slow for such a specific application.
For speed, drop the regex and use one of the straightforward, brute-force approaches: run through each string in order, applying the backspaces as you go. For instance, change both the backspace and the nearest preceding letter that has not already been blanked to spaces. When you reach the end, remove all the spaces to make a new string. Do this with both S and T, then compare the results for equality.
It may be easiest to start at the end of the string and work towards the beginning:
def process(str)
n = 0
str.reverse.each_char.with_object('') do |c,s|
if c == '#'
n += 1
else
n.zero? ? (s << c) : n -= 1
end
end.reverse
end
%w|ab#c ad#c ab## c#d# a##c #a#c a#c b|.each_slice(2) do |s1, s2|
puts "\"%s\" -> \"%s\", \"%s\" -> \"%s\" %s" %
[s1, process(s1), s2, process(s2), (process(s1) == process(s2)).to_s]
end
"ab#c" -> "ac", "ad#c" -> "ac" true
"ab##" -> "", "c#d#" -> "" true
"a##c" -> "c", "#a#c" -> "c" true
"a#c" -> "c", "b" -> "b" false
Let's look at a longer string.
require 'time'
alpha = ('a'..'z').to_a
#=> ["a", "b", "c",..., "z"]
s = (10**6).times.with_object('') { |_,s|
s << (rand < 0.4 ? '#' : alpha.sample) }
#=> "h####fn#fjn#hw###axm...#zv#f#bhqsgoem#glljo"
s.size
#=> 1000000
s.count('#')
#=> 398351
and see how long it takes to process.
require 'time'
start_time = Time.now
(u = process(s)).size
#=> 203301
puts (Time.now - start_time).round(2)
#=> 0.28 (seconds)
u #=> "ffewuawhfa...qsgoeglljo"
As u will be missing the 398351 pound signs in s, plus an almost equal number of other characters removed by the pound signs, we would expect u.size to be about:
10**6 - 2 * s.count('#')
#=> 203298
In fact, u.size #=> 203301, meaning that, at the end, 203301 - 203298 #=> 3 pound signs were unable to remove a character from s.
In fact, process can be simplified. I leave that as an exercise for the reader.
import java.util.Stack;
class Solution {
public boolean backspaceCompare(String s, String t) {
try {
Stack<Character> st1 = new Stack<>();
Stack<Character> st2 = new Stack<>();
st1 = convertToStack(s);
st2 = convertToStack(t);
if (st1.size() != st2.size()) {
return false;
} else {
int length = st1.size();
for (int i = 0; i < length; i++) {
if (!st1.peek().equals(st2.peek())) // compare values, not boxed Character references
return false;
else {
st1.pop();
st2.pop();
}
if (st1.isEmpty() && st2.isEmpty())
return true;
}
}
} catch (Exception e) {
System.out.print(e);
}
return true;
}
public Stack<Character> convertToStack(String s){
Stack<Character> st1 = new Stack<>();
for (int i = 0; i < s.length(); i++) {
if (s.charAt(i) != '#') {
st1.push(s.charAt(i));
} else if (st1.empty()) {
continue;
} else {
st1.pop();
}
}
return st1;
}
}
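Neither answer above attempts the follow-up (O(N) time and O(1) space). One common way to get there is to scan both strings from the end with two indices, counting pending backspaces instead of building new strings. The following is a sketch of that technique in JavaScript, not something taken from the answers above.

```javascript
// Compares two typed strings where '#' is a backspace,
// walking both from right to left without building new strings.
function backspaceCompare(s, t) {
  let i = s.length - 1;
  let j = t.length - 1;
  let skipS = 0; // backspaces in s still waiting to delete a letter
  let skipT = 0; // same for t

  while (i >= 0 || j >= 0) {
    // Move i to the next character of s that survives the backspaces.
    while (i >= 0) {
      if (s[i] === '#') { skipS++; i--; }
      else if (skipS > 0) { skipS--; i--; }
      else break;
    }
    // Same for j in t.
    while (j >= 0) {
      if (t[j] === '#') { skipT++; j--; }
      else if (skipT > 0) { skipT--; j--; }
      else break;
    }
    if (i >= 0 && j >= 0) {
      if (s[i] !== t[j]) return false; // surviving letters differ
    } else if (i >= 0 || j >= 0) {
      return false; // one string has a surviving letter, the other does not
    }
    i--;
    j--;
  }
  return true;
}
```

Each index only ever moves left, so every character is visited once, and the only extra state is the two indices and the two counters.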

Regular Expression for conditional replacement of parts of string in US phone number mask (Swift compatible)

I am trying to come up with a regular expression pattern that fulfills the following requirements.
It is a US phone number format with 3 groups.
I have input strings like this
(999) 98__-9999 here there is extra _ at the end of second section which I want to delete
(999) 9_8_-9999 here there is extra _ at the end of second section I want to delete
(999) 9_-9999 here if second group length is < 3 and ends with _ there should be added _ to pad second group to 9__ (3 characters)
(999) 98-9999 here if second group length is equal to 3 or it ends with digit there shouldn't be any modifications
To sum up:
If secondGroup.length > 3 && secondGroup.lastCharacter == '_' I want to remove this last character
else if secondGroup.length < 3 && secondGroup.lastCharacter == '_' I want to append "_" (or pad with underscores to have 3 characters in total)
else leave second group as in the input string.
The same should be applied to the first group. The only difference is the delimiters, i.e. (xxx) for the first group and \sxxx- for the second group.
Here is my Swift code I have used to achieve it in brute force way by manually manipulating the string: (length 4 instead of 3 takes into account first delimiter like ( or \s. )
var componentText = ""
let idx1 = newText.index(of: "(")
let idx2 = newText.index(of: ")")
if let idx1 = idx1, let idx2 = idx2 {
var component0 = newText[..<idx1]
var component1 = newText[idx1..<idx2]
if component1.count > 4 && component1.last == "_" {
component1.popLast()
} else if component1.count < 4 && component1.last == "_" {
component1.append("_")
}
componentText += "\(component0)\(component1))"
} else {
componentText = newText
}
let idx3 = newText.index(of: " ")
let idx4 = newText.index(of: "-")
if let idx2 = idx2, let idx3 = idx3, let idx4 = idx4 {
var component2 = newText[idx2..<idx3]
component2.popFirst()
var component3 = newText[idx3..<idx4]
var component4 = newText[idx4...]
if component3.count > 4 && component3.last == "_" {
component3.popLast()
} else if component3.count < 4 && component3.last == "_" {
component3.append("_")
}
componentText += "\(component2) \(component3)-\(component4)"
} else {
componentText = newText
}
newText = componentText != "" ? componentText : newText
I think that using regular expression this code could be more flexible and much shorter.
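The rules summarized above can be sketched with two regex replacements and a small helper. This is written in JavaScript, not Swift, and rests on a few assumptions: the first group sits between the first "(" and the next ")", the second group sits between the first space and the next "-", and a short group is padded to three characters rather than given a single extra underscore. The names fixGroup and normalizeMask are made up for illustration.

```javascript
// Applies the length rule to one group (delimiters already stripped).
function fixGroup(group) {
  if (!group.endsWith('_')) return group;              // ends with a digit: keep
  if (group.length > 3) return group.slice(0, -1);     // too long: drop last "_"
  if (group.length < 3) return group.padEnd(3, '_');   // too short: pad to 3
  return group;                                        // exactly 3: keep
}

// Rewrites the first "(xxx)" group and the first " xxx-" group.
function normalizeMask(text) {
  return text
    .replace(/\(([^)]*)\)/, (match, group) => '(' + fixGroup(group) + ')')
    .replace(/ ([^-]*)-/, (match, group) => ' ' + fixGroup(group) + '-');
}
```

With the four inputs from the question, this gives "(999) 98_-9999", "(999) 9_8-9999", "(999) 9__-9999" and "(999) 98-9999" respectively.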

How to highlight a string within a string ignoring whitespace and non alphanumeric chars?

What is the best way to produce a highlighted string found within another string?
I want to ignore all character that are not alphanumeric but retain them in the final output.
So for example a search for 'PC3000' in the following 3 strings would give the following results:
ZxPc 3000L = Zx<font color='red'>Pc 3000</font>L
ZXP-C300-0Y = ZX<font color='red'>P-C300-0</font>Y
Pc3 000 = <font color='red'>Pc3 000</font>
I have the following code but the only way I can highlight the search within the result is to remove all the whitespace and non alphanumeric characters and then set both strings to lowercase. I'm stuck!
public string Highlight(string Search_Str, string InputTxt)
{
// Setup the regular expression and add the Or operator.
Regex RegExp = new Regex(Search_Str.Replace(" ", "|").Trim(), RegexOptions.IgnoreCase);
// Highlight keywords by calling the delegate each time a keyword is found.
string Lightup = RegExp.Replace(InputTxt, new MatchEvaluator(ReplaceKeyWords));
if (Lightup == InputTxt)
{
Regex RegExp2 = new Regex(Search_Str.Replace(" ", "|").Trim(), RegexOptions.IgnoreCase);
RegExp2.Replace(" ", "");
Lightup = RegExp2.Replace(InputTxt.Replace(" ", ""), new MatchEvaluator(ReplaceKeyWords));
int Found = Lightup.IndexOf("<font color='red'>");
if (Found == -1)
{
Lightup = InputTxt;
}
}
RegExp = null;
return Lightup;
}
public string ReplaceKeyWords(Match m)
{
return "<font color='red'>" + m.Value + "</font>";
}
Thanks guys!
Alter your search string by inserting an optional non-alphanumeric character class ([^a-z0-9]?) between each character. Instead of PC3000 use
P[^a-z0-9]?C[^a-z0-9]?3[^a-z0-9]?0[^a-z0-9]?0[^a-z0-9]?0
This matches Pc 3000, P-C300-0 and Pc3 000.
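Building that pattern by hand gets tedious, so here is a minimal sketch that generates it from the search string. It is written in JavaScript rather than C#, and the function name highlight is only illustrative. Non-alphanumeric characters in the search string are dropped before the pattern is built, which is an assumption on my part.

```javascript
// Wraps every match of `search` in `input` with a red <font> tag,
// allowing at most one non-alphanumeric character between letters.
function highlight(search, input) {
  // Keep only the alphanumeric characters of the search term.
  const chars = Array.from(search).filter(c => /[a-z0-9]/i.test(c));
  if (chars.length === 0) return input;

  // PC3000 becomes P[^a-z0-9]?C[^a-z0-9]?3[^a-z0-9]?0[^a-z0-9]?0[^a-z0-9]?0
  const pattern = chars.join('[^a-z0-9]?');
  const regex = new RegExp(pattern, 'gi');
  return input.replace(regex, m => "<font color='red'>" + m + "</font>");
}
```

Because the matched text itself is re-inserted, the original casing and separators are preserved in the output.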
One way to do this would be to create a version of the input string that only contains alphanumerics and a lookup array that maps character positions from the new string to the original input. Then search the alphanumeric-only version for the keyword(s) and use the lookup to map the match positions back to the original input string.
Pseudo-code for building the lookup array:
cleanInput = "";
lookup = [];
lookupIndex = 0;
for ( index = 0; index < input.length; index++ ) {
if ( isAlphaNumeric(input[index]) ) {
cleanInput += input[index];
lookup[lookupIndex] = index;
lookupIndex++;
}
}
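Fleshing out that pseudo-code, a runnable sketch of the whole lookup approach might look like this in JavaScript. The function name highlightWithLookup is hypothetical, only the first match is highlighted, and "alphanumeric" is assumed to mean ASCII letters and digits.

```javascript
// Highlights the first occurrence of `search` in `input`, ignoring
// case and any non-alphanumeric characters in between.
function highlightWithLookup(search, input) {
  const isAlphaNumeric = c => /[a-z0-9]/i.test(c);

  // Build the cleaned, lower-cased input and the position lookup.
  let cleanInput = '';
  const lookup = []; // lookup[k] = index in `input` of cleanInput[k]
  for (let index = 0; index < input.length; index++) {
    if (isAlphaNumeric(input[index])) {
      cleanInput += input[index].toLowerCase();
      lookup.push(index);
    }
  }

  // Clean the search term the same way, then search.
  const needle = Array.from(search).filter(isAlphaNumeric).join('').toLowerCase();
  if (needle === '') return input;
  const pos = cleanInput.indexOf(needle);
  if (pos === -1) return input;

  // Map the match boundaries back to the original string.
  const start = lookup[pos];
  const end = lookup[pos + needle.length - 1] + 1;
  return input.slice(0, start) +
    "<font color='red'>" + input.slice(start, end) + "</font>" +
    input.slice(end);
}
```

Unlike the optional-character regex, this version tolerates any number of separators between two characters of the search term.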

Is it possible to generate a (compact) regular expression for an anagram of an arbitrary string?

Problem: write a program in any language which, given a string of characters, generates a regex that matches any anagram of the input string. For all inputs longer than some length N, the regex must be shorter than the "brute force" solution listing all possible anagrams separated by "|", and the length of the regex should grow "slowly" as the input string grows (ideally linearly, but possibly n ln n).
Can you do it? I've tried, but my attempts are so far from succeeding, that I'm beginning to doubt it's possible. The only reason I ask is I thought I had seen a solution on another site, but much pointless googling failed to uncover it a second time.
I think this javascript code will work according to your specifications. The regex length will increase linearly with the length of the input. It generates a regex which uses positive lookahead to match the anagram of the input string. The lookahead part of regex makes sure all the characters are present in the test input string ignoring their order and the matching part ensures that the length of the test input string is same as the length of the input string (for which regex is constructed).
function anagramRegexGenerator(input) {
var lookaheadPart = '';
var matchingPart = '^';
var positiveLookaheadPrefix='(?=';
var positiveLookaheadSuffix=')';
var inputCharacterFrequencyMap = {}
for ( var i = 0; i< input.length; i++ )
{
if (!inputCharacterFrequencyMap[input[i]]) {
inputCharacterFrequencyMap[input[i]] = 1
} else {
++inputCharacterFrequencyMap[input[i]];
}
}
for ( var j in inputCharacterFrequencyMap) {
lookaheadPart += positiveLookaheadPrefix;
for (var k = 0; k< inputCharacterFrequencyMap[j]; k++) {
lookaheadPart += '.*';
if (j == ' ') {
lookaheadPart += '\\s';
} else {
lookaheadPart += j.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters
}
matchingPart += '.';
}
lookaheadPart += positiveLookaheadSuffix;
}
matchingPart += '$';
return lookaheadPart + matchingPart;
}
Sample input and output is the following
anagramRegexGenerator('aaadaaccc')
//generates the following string.
"(?=.*a.*a.*a.*a.*a)(?=.*d)(?=.*c.*c.*c)^.........$"
anagramRegexGenerator('abcdef ghij');
//generates the following string.
"(?=.*a)(?=.*b)(?=.*c)(?=.*d)(?=.*e)(?=.*f)(?=.*\s)(?=.*g)(?=.*h)(?=.*i)(?=.*j)^...........$"
//test run returns true
/(?=.*a)(?=.*b)(?=.*c)(?=.*d)(?=.*e)(?=.*f)(?=.*\s)(?=.*g)(?=.*h)(?=.*i)(?=.*j)^...........$/.test('acdbefghij ')
//or using the RegExp object
//this returns true
new RegExp(anagramRegexGenerator('abcdef ghij')).test('acdbefghij ')
//this returns false
new RegExp(anagramRegexGenerator('abcdef ghij')).test('acdbefghijj')