Removing keys and values from map - clojure

I have a map that looks like this:
{\a [\h]
\h [\w \w]
\i [\w \h \t]
\p [\t \u \h \a]
\s [\t \a \t \t \i \w \h]
\t [\a]
\u [\t \t \s]
\w []}
I want to remove, for example, \w from both the keys and the values, leaving this:
{\a [\h]
\h []
\i [\h \t]
\p [\t \u \h \a]
\s [\t \a \t \t \i \h]
\t [\a]
\u [\t \t \s]}
Notice, the \w key has gone and \w has gone from all the values!
Right now I have this, which works, but I'm sure there must be a better, more Clojurey way!
(defn remove-last [last cmap]
  (reduce-kv (fn [acc k v]
               (if (empty? v)
                 acc
                 (into acc {k (vec (filter #(not= % last) v))})))
             {}
             cmap))
The key to remove will always map to an empty vector.
How can I do this better?

What I would propose is to first dissoc the \w key from the map (this is effectively a constant-time operation), and then use a transducer to rebuild the map in one pass. This keeps the declarative style and the performance while avoiding the verbosity of reduce. It could look like this:
(into {}
      (map (fn [[k v]] [k (filterv #(not= \w %) v)]))
      (dissoc data \w))
Personally, I find this more readable than reduce/assoc or reduce/update.

I like Specter so much (with com.rpl.specter required as s):
(s/setval (s/walker #(= \w %)) s/NONE {\a [\h]
                                       \h [\w \w]
                                       \i [\w \h \t]
                                       \p [\t \u \h \a]
                                       \s [\t \a \t \t \i \w \h]
                                       \t [\a]
                                       \u [\t \t \s]
                                       \w []})

I find your solution quite idiomatic. The requirement is unusual enough that I immediately think of reduce. Your call to empty? does not match your spec, though: you should test whether the key k is = to last.
Also, I wouldn't use the name last here, because it shadows clojure.core/last.
A very similar alternative would be
(defn remove-all-of [it m]
  (reduce
    (fn [acc [k v]]
      (if (not= it k)
        (assoc acc
               k
               (into (empty v)
                     (filter #(not= it %) v)))
        acc))
    {}
    m))
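Thanks to (empty v), this also works when the values are seqable collections other than vectors, such as sets. Note that into conjoins at the front of a list, so list values would come back in reverse order.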

Related

Regex matching parentheses with =

I'm trying to write a regex to filter out parameters of a handlebars call:
example call:
117-tooltip classes=(concat (concat "productTile__product-availability " classes) " tooltip--small-icon productAvailability__tooltip") bla=(concat "test" "test2")
what my matches should be:
classes=(concat (concat "productTile__product-availability " classes) " tooltip--small-icon productAvailability__tooltip")
bla=(concat "test" "test2")
what my matches currently are:
(concat (concat "productTile__product-availability " classes) " tooltip--small-icon productAvailability__tooltip")
(concat "test" "test2")
my regex:
\((?>[^()]|(?R))*\)
I need to extend it so the structure must be something=(...(...)..) with an unknown number of matching parentheses.
How do I extend the regex so that the x= part is included in the match?
You can use a regex subroutine:
(\w+)=(\(((?>[^()]++|(?2))*)\))
Details:
(\w+) - Capturing group 1: one or more word chars
= - a = char
(\(((?>[^()]++|(?2))*)\)) - Group 2 (needed for the regex subroutine to work):
\( - ( char
((?>[^()]++|(?2))*) - Group 3: zero or more repetitions of one or more chars other than ( and ) or the whole Group 2 pattern recursed
\) - a ) char.
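The (?2) subroutine call needs an engine with recursion support, such as PCRE; Python's standard re module does not have it. As a rough sketch of the same idea in Python, the helper below (balanced_assignments is an illustrative name, not a library function) finds each key=( opener with a regex and then counts parentheses by hand. Like the regex above, it assumes that quoted strings contain no unbalanced parentheses:

```python
import re

# Matches the "key=(" opener; the balanced tail is found by counting.
KEY_OPEN = re.compile(r"\w+=\(")


def balanced_assignments(text):
    """Return every 'key=(...)' substring whose parentheses balance."""
    found = []
    pos = 0
    while True:
        m = KEY_OPEN.search(text, pos)
        if m is None:
            break
        depth = 1          # the opener already consumed one "("
        i = m.end()
        while i < len(text) and depth > 0:
            if text[i] == "(":
                depth += 1
            elif text[i] == ")":
                depth -= 1
            i += 1
        if depth == 0:
            found.append(text[m.start():i])
            pos = i        # continue after the balanced group
        else:
            pos = m.end()  # unbalanced opener: skip it and keep looking
    return found


call = ('117-tooltip classes=(concat (concat "productTile__product-availability " classes) '
        '" tooltip--small-icon productAvailability__tooltip") bla=(concat "test" "test2")')

result = balanced_assignments(call)
for item in result:
    print(item)
```

This trades the compactness of the recursive pattern for portability to engines without recursion.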
I would use:
\b\w+=.*?(?=\s+\w+=|$)
The idea behind this pattern is to match a key= followed by all content leading up to, but not including, either the next key, or the end of the input.
Explanation:
\b\w+=          match a KEY=
.*?             match all content up to, but not including, what the lookahead asserts
(?=\s+\w+=|$)   assert that what follows is one or more whitespace characters followed by KEY=, OR the end of the input
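As a quick check, this lookahead-based pattern also runs unchanged in Python's re module, since it needs no recursion. The sketch below applies it to the example call from the question (the variable names are only illustrative):

```python
import re

call = ('117-tooltip classes=(concat (concat "productTile__product-availability " classes) '
        '" tooltip--small-icon productAvailability__tooltip") bla=(concat "test" "test2")')

# No capturing groups, so findall returns the full text of each match.
pairs = re.findall(r"\b\w+=.*?(?=\s+\w+=|$)", call)

for pair in pairs:
    print(pair)
```

Because the pattern does not count parentheses, a value that itself contains whitespace followed by word= would end the match early.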

regex to match a word and ignore the line if it contains another word

The regex needs to match the lines with [m] and ignore the lines with [F].
It should select lines 1 & 2 only:
1.[m]dfsd
2.[M]
3.[M]dfdfd[F]
4.[M]dfsd[f]
5.[m]dfd[F]
6.[m]fsdf[f]
I tried this:
(?=.[m])(?!=.[f])
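Your attempt to use lookaheads is on the right track. Two things go wrong in it: the unescaped [m] and [f] are character classes rather than literal brackets, and (?!= is a negative lookahead for a literal = character. I would use a negative lookahead to exclude [F], and then match [m] anywhere in the line: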
^(?!.*\[[Ff]\]).*\[[Mm]\].*$
Note that I used [Mm] and [Ff] so that the male and female markers match case-insensitively.
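A small sketch of how this pattern behaves on the six sample lines from the question, using Python's re module (the list and variable names are illustrative only):

```python
import re

# Reject any line containing [F]/[f], then require [M]/[m] somewhere in it.
pattern = re.compile(r"^(?!.*\[[Ff]\]).*\[[Mm]\].*$")

lines = [
    "1.[m]dfsd",
    "2.[M]",
    "3.[M]dfdfd[F]",
    "4.[M]dfsd[f]",
    "5.[m]dfd[F]",
    "6.[m]fsdf[f]",
]

selected = [line for line in lines if pattern.search(line)]
print(selected)  # ['1.[m]dfsd', '2.[M]']
```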

Forcing regex to ignore detection depending on preposition

I'm trying to build a regex, which will detect usernames mentioned in a string. The usernames can look like "username", "username[0-9]", "adm-username", "adm-username[0-9]".
As of now, I have this: \b(adm\-){0,1}username[0-9]{0,1}\b (link: https://regexr.com/4at34)
The problem is with adm-. If the prefix is wrong, as in aadm-username, the regex still detects 'username', but I want it to fail. Any tips on how to do that?
Thanks
Instead of \b, you can assert that the match is not surrounded by characters from [\w-].
Use lookarounds for this, so that the boundaries are checked but not matched.
Finally, don't capture intermediate groups; use a single capturing group for the whole match.
(?<![\w-])((?:adm-)?username\d?)(?![\w-])
[v] username
[v] username2
[v] adm-username
[v] adm-username2
[x] aadm-username
[x] aadm-username2
Explanation
(?<![\w-])        # negative lookbehind: only match if not preceded by a word character or hyphen
(
  (?:adm-)?       # non-capturing group: adm- literally, once or not at all; captured as part of the outer group
  username\d?     # username literally, followed by an optional digit
)
(?![\w-])         # negative lookahead: only match if not followed by a word character or hyphen
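To compare the two patterns side by side, here is a minimal sketch in Python's re module (the names old, new, and samples are illustrative). It runs the \b-based pattern from the question and the lookaround-based pattern over the six test strings listed above:

```python
import re

old = re.compile(r"\b(adm\-){0,1}username[0-9]{0,1}\b")
new = re.compile(r"(?<![\w-])((?:adm-)?username\d?)(?![\w-])")

samples = [
    "username",
    "username2",
    "adm-username",
    "adm-username2",
    "aadm-username",
    "aadm-username2",
]

old_hits = [s for s in samples if old.search(s)]
new_hits = [s for s in samples if new.search(s)]

print(old_hits)  # all six: \b sees a boundary between "-" and "u"
print(new_hits)  # only the first four
```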

Regex Camel Case only first Uppercase by group

This is my current regex:
/([A-Z])(?![A-Z])/gm
Below is how it's evaluated:
https://regex101.com/r/K1gvmr/1
In the screenshot you can see I'm getting:
[F] oo [B] ar XYBA [Z]
Instead, I need these matches:
[F] oo [B] ar [X] YBAZ
How can a negative lookahead (or another approach) stop the evaluation at the first character of each group only?
Try this regex: (?<![A-Z])[A-Z]
Explanation:
A negative lookbehind asserts that the previous character is not in A-Z, and then a single character in A-Z is matched.
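A short sketch comparing the two patterns in Python's re module. The input string FooBarXYBAZ is an assumption, reconstructed from the bracketed output shown in the question:

```python
import re

text = "FooBarXYBAZ"  # assumed input, inferred from the question's output

# Question's pattern: an uppercase letter NOT followed by another uppercase letter.
lookahead_hits = re.findall(r"([A-Z])(?![A-Z])", text)

# Answer's pattern: an uppercase letter NOT preceded by another uppercase letter.
lookbehind_hits = re.findall(r"(?<![A-Z])[A-Z]", text)

print(lookahead_hits)   # ['F', 'B', 'Z']
print(lookbehind_hits)  # ['F', 'B', 'X']
```

The lookahead version picks the last letter of each uppercase run, while the lookbehind version picks the first.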

Regular Expression nongreedy is greedy

I have the following text
tooooooooooooon
According to this book I'm reading, when the ? follows after any quantifier, it becomes non greedy.
My regex to*?n is still returning tooooooooooooon.
It should return ton shouldn't it?
Any idea why?
A regular expression can only match a fragment of text that actually exists.
Because the substring 'ton' doesn't exist anywhere in your string, it can't be the result of a match. A match will only ever return a substring of the original string.
EDIT: To be clear, if you were using the string below, with an extra 'n'
toooooooonoooooon
this regular expression (which doesn't specify 'o's)
t.*n
would match the following (as many characters as possible before an 'n')
toooooooonoooooon
but the regular expression
t.*?n
would only match the following (as few characters as possible before an 'n')
toooooooon
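The difference is easy to reproduce. The sketch below uses Python's re module (variable names are illustrative) on the string with the extra 'n', and then on the original string from the question:

```python
import re

text = "toooooooonoooooon"  # the string with the extra 'n'

greedy = re.search(r"t.*n", text).group()   # runs to the last 'n'
lazy = re.search(r"t.*?n", text).group()    # stops at the first 'n'

print(greedy)
print(lazy)

# In the original string there is only one 'n', so the lazy pattern
# has no shorter match available and returns the whole string.
original = "tooooooooooooon"
only_match = re.search(r"to*?n", original).group()
print(only_match == original)  # True
```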
A regular expression is always eager to match.
Your expression says this:
A 't', followed by *as few as possible* 'o's, followed by a 'n'.
That means any o's necessary will be matched, because there is an 'n' at the end, which the expression is eager to reach. Matching all the o's is its only possibility to succeed.
Regexps try to match everything in them. Because the n cannot be reached without first matching every o in toooon, everything is matched. Also, because you are using o*? instead of o+?, you are not requiring an o to be present.
Example, in Perl
$a = "toooooo";
$b = "toooooon";
if ($a =~ m/(to*?)/) {
print $1,"\n";
}
if ($b =~ m/(to*?n)/) {
print $1,"\n";
}
~>perl ex.pl
t
toooooon
The regex engine always does its best to match. The only thing the minimal quantifier does in this case is slow the matcher down, by making it backtrack into the /o*?/ node once for every single 'o' in "tooooon". With normal (greedy) matching, it takes as many 'o's as it can the first time through. Since the next element to match is 'n', which 'o' can never match, there is little point in using minimal matching.

However, when normal matching fails, it takes quite a while to fail: it has to backtrack through every 'o' until there are none left to give back. In this case I would actually use maximal (possessive) matching, /to*+n/. The 'o*+' takes all it can and never gives any of it back, so when the match fails, it fails quickly.
Minimal RE succeeding:
'toooooon' ~~ /to*?n/
t o o o o o o n
{t} match [t]
[t] match [o] 0 times
[t]<n> fail to match [n] -> retry [o]
[t]{o} match [o] 1 times
[t][o]<n> fail to match [n] -> retry [o]
[t][o]{o} match [o] 2 times
[t][o][o]<n> fail to match [n] -> retry [o]
. . . .
[t][o][o][o][o]{o} match [o] 5 times
[t][o][o][o][o][o]<n> fail to match [n] -> retry [o]
[t][o][o][o][o][o]{o} match [o] 6 times
[t][o][o][o][o][o][o]{n} match [n]
Normal RE succeeding:
(NOTE: Similar for Maximal RE)
'toooooon' ~~ /to*n/
t o o o o o o n
{t} match [t]
[t]{o}{o}{o}{o}{o}{o} match [o] 6 times
[t][o][o][o][o][o][o]{n} match [n]
Failure of Minimal RE:
'toooooo' ~~ /to*?n/
t o o o o o o
. . . .
. . . .
[t][o][o][o][o]{o} match [o] 5 times
[t][o][o][o][o][o]<n> fail to match [n] -> retry [o]
[t][o][o][o][o][o]{o} match [o] 6 times
[t][o][o][o][o][o][o]<n> fail to match [n] -> retry [o]
[t][o][o][o][o][o][o]<o> fail to match [o] 7 times -> match failed
Failure of Normal RE:
'toooooo' ~~ /to*n/
t o o o o o o
{t} match [t]
[t]{o}{o}{o}{o}{o}{o} match [o] 6 times
[t][o][o][o][o][o][o]<n> fail to match [n] -> retry [o]
[t][o][o][o][o][o] match [o] 5 times
[t][o][o][o][o][o]<n> fail to match [n] -> retry [o]
. . . .
[t][o] match [o] 1 times
[t][o]<n> fail to match [n] -> retry [o]
[t] match [o] 0 times
[t]<n> fail to match [n] -> match failed
Failure of Maximal RE:
'toooooo' ~~ /to*+n/
t o o o o o o
{t} match [t]
[t]{o}{o}{o}{o}{o}{o} match [o] 6 times
[t][o][o][o][o][o][o]<n> fail to match [n] -> match failed
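The traces above can be summarised by counting how often the engine tries to match 'n'. The sketch below is a hand-written toy model of t, o-quantifier, n anchored at the start of the string; it is not how any real regex engine is implemented, and n_attempts and the mode names are made up for illustration:

```python
def n_attempts(text, mode):
    """Model /to*n/ variants at index 0.

    Returns (matched, number of times 'n' was tried).
    mode is "lazy" (o*?), "greedy" (o*) or "possessive" (o*+).
    """
    if not text.startswith("t"):
        return False, 0
    # Count the run of 'o's directly after the 't'.
    max_o = 0
    while 1 + max_o < len(text) and text[1 + max_o] == "o":
        max_o += 1
    if mode == "lazy":
        counts = range(0, max_o + 1)    # try 0 o's first, then add one at a time
    elif mode == "greedy":
        counts = range(max_o, -1, -1)   # try all o's first, then give one back at a time
    elif mode == "possessive":
        counts = [max_o]                # take all o's and never give any back
    else:
        raise ValueError("unknown mode: " + mode)
    attempts = 0
    for k in counts:
        attempts += 1
        pos = 1 + k
        if pos < len(text) and text[pos] == "n":
            return True, attempts
    return False, attempts


hit = "t" + "o" * 6 + "n"   # 'toooooon'
miss = "t" + "o" * 6        # 'toooooo'

for mode in ("lazy", "greedy", "possessive"):
    print(mode, n_attempts(hit, mode), n_attempts(miss, mode))
```

In this model the lazy version needs seven tries at 'n' to succeed where the greedy one needs one, and on the failing string only the possessive version gives up after a single try.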
The string you are searching in (the haystack as it were) does not contain the substring "ton".
It does however contain the substring "tooooooooooooon".