What is this error? - FAILED DURING THE BUILDING PHASE - regex

I got this error while building:
dist/package.conf.inplace:
inappropriate type
FAILED DURING THE BUILDING PHASE. The exception was: ExitFailure 1
How do I use subRegex in package Text.Regex?
I have written:
import Text.Regex.Posix
But I got this error:
_.hs:13:5: Not in scope: ‘subRegex’
_.hs:13:15:
Not in scope: ‘mkRegex’
Perhaps you meant ‘makeRegex’ (imported from Text.Regex.Posix)
So, I went to Text.Regex's [page][1], and there it said:
Uses the POSIX regular expression interface in Text.Regex.Posix.
So why aren't these functions in scope?

Here are some steps you can perform to make it work.
Download from http://hackage.haskell.org/package/regex-compat-0.92, unzip to <Haskell Platform INSTALL FOLDER>\2014.2.0.0\lib\
Run Haskell.
Type :mod +Text.Regex to load the package.
Type, e.g. subRegex (mkRegex "[0-9]+") "foobar567" "123"
Result is "foobar123" (after all packages are loaded).
Here is the subRegex description:
subRegex
  :: Regex   -- Search pattern
  -> String  -- Input string
  -> String  -- Replacement text
  -> String  -- Output string
Replaces every occurrence of the given regexp with the replacement
string.
In the replacement string, "\1" refers to the first substring; "\2" to
the second, etc; and "\0" to the entire match. "\\" will insert a
literal backslash.
This does not advance if the regex matches an empty string. This
misfeature is here to match the behavior of the original
Text.Regex API.
Some cool links that can help you delve deeper:
http://www.serpentine.com/blog/2007/02/27/a-haskell-regular-expression-tutorial/, and
https://wiki.haskell.org/Cookbook/Pattern_matching.
I am using it on Windows.

You shouldn't import Text.Regex.Posix, but rather just Text.Regex, because the two functions you want are there.
Have a look at the Hackage page: you were almost there, but the functions were actually in that module.


How can I create a Regex that matches and transforms a period delimited path?

I am using den4b Renamer to rename a lot of files that follow a specific pattern. The program allows me to use RegEx: (https://www.den4b.com/wiki/ReNamer:Regular_Expressions)
I am stuck trying to conjure up an expression for a specific pattern.
My current RegEx:
Expression: ^(com\.)(([\w\s]*\.){0,4})([\w\s]*)$
Replace: \L$1\L$2\u$4
Note: \L and \u transform the sub-expression to upper and lower case as defined in the table below:
Here are a few example strings so you can get an idea of the input:
Android File Transfer.svg
Angular Console.svg
Au.Edu.Uq.Esys.Escript.svg
Avidemux.svg
Blackmagic Fusion8.svg
Broken Sword.svg
Browser360 Beta.svg
Btsync GUI.svg
Buttercup Desktop.svg
Calc.svg
Calibre EBook Edit.svg
Calibre Viewer.svg
Call Of Duty.svg
com.GitHub.Plugarut.Pwned Checker.svg
com.GitHub.Plugarut.Wingpanel Monitor.svg
com.GitHub.Rickybas.Date Countdown.svg
com.GitHub.Spheras.Desktopfolder.svg
com.GitHub.Themix Project.Oomox.svg
com.GitHub.Unrud.Remote Touchpad.svg
com.GitHub.Unrud.Video Downloader.svg
com.GitHub.Weclaw1.Image Roll.svg
com.GitHub.Zelikos.Rannum.svg
com.Gitlab.Miridyan.Mt.svg
com.Inventwithpython.Flippy.svg
com.Neatdecisions.Detwinner.svg
com.Rafaelmardojai.Share Preview.svg
com.Rafaelmardojai.Webfont Kit Generator.svg
Distributor Logo Antix.svg
Distributor Logo Archlabs.svg
Distributor Logo Dragonflybsd.svg
DOSBox.svg
Drawio.svg
Drweb GUI.svg
For this question I am focused on the strings that begin with com.xxx.xxx.
Since I can't only target those names in Renamer, the expression has to "play nice" with the other input file names and correctly leave them alone. That's why I've prefixed my expression with ^(com\.)
What I want:
Transform the entire string to lower case except for the last period separated part of the string.
Strip white space from the entire string.
For instance:
Original: com.GitHub.Alcadica.Develop.svg
After my Regex: com.github.alcadica.Develop.svg
What I want: com.github.alcadica.Develop.svg
This specific file is correctly renamed. What I'm having trouble with are names that have spaces in any part of the string. I can't figure out how to strip whitespace:
Original: com.Belmoussaoui.Read it Later.svg
After my Regex: com.belmoussaoui.Read it Later.svg
What I want: com.belmoussaoui.ReaditLater.svg
Here is a hypothetical example because I couldn't find a file with more than four parts. I want my pattern to be robust enough to handle this:
Original: com.Shatteredpixel.Another Level.Next.Pixel Dungeon.svg
After my Regex: com.shatteredpixel.another level.next.Pixel Dungeon.svg
What I want: com.shatteredpixel.anotherlevel.next.PixelDungeon.svg
Note that since I'm not using any kind of programming language, I don't have access to common string operations like trim, etc. I can, however, stack expressions. But this would create more overhead and since I am renaming thousands of files at a time I'd ideally like to keep it to one find/replace expression.
Any help would be greatly appreciated. Please let me know if I can provide any more information to make this more clear.
Edit:
I got it to work with the following rules:
Really inefficient, but it works. (Thanks to Jeremy in the comments for the idea)
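For reference, the transformation being asked for can be written down outside ReNamer as a short Python sketch. It is not a ReNamer rule. The function name normalize, and the choice to treat everything after the last period as the file extension, are assumptions made for illustration.

```python
import re


def normalize(name: str) -> str:
    """Lower-case every period-separated part except the last one and
    strip whitespace from all parts. Names not starting with "com." are
    returned unchanged."""
    # Assumption: the text after the last period is the extension.
    stem, dot, ext = name.rpartition(".")
    if not stem.startswith("com."):
        return name
    # Remove whitespace inside each part.
    parts = [re.sub(r"\s+", "", part) for part in stem.split(".")]
    # Lower-case all parts except the last, which keeps its case.
    parts = [part.lower() for part in parts[:-1]] + [parts[-1]]
    return ".".join(parts) + dot + ext


print(normalize("com.Belmoussaoui.Read it Later.svg"))
# com.belmoussaoui.ReaditLater.svg
print(normalize("Call Of Duty.svg"))
# Call Of Duty.svg
```

Splitting on the period first keeps the "last part" rule separate from the whitespace rule, which is the piece a single find/replace expression struggles with.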

OCaml: How to remove all non-alphabetic characters from a string?

How do I remove all the non-alphabetic characters from a string?
E.g.
"Wë_1ird?!" -> "Wëird"
In Perl, I'd do this with =~ s/[\W\d_]+//g. In Python, I'd use
re.sub(ur'[\W\d_]+', u'', u"Wë_1ird?!", flags=re.UNICODE)
Etc.
AFAICT, Str.regexp does not support \W, \d, etc. (I can't tell whether it supports Unicode, but somehow I doubt it).
Str doesn't support Unicode. Assuming you are dealing with UTF-8 encoded data, you can use Uutf and Uucp as follows:
let keep_alpha s =
  let b = Buffer.create 255 in
  let add_alpha () _ = function
    | `Malformed _ -> Uutf.Buffer.add_utf_8 b Uutf.u_rep
    | `Uchar u -> if Uucp.Alpha.is_alphabetic u then Uutf.Buffer.add_utf_8 b u
  in
  Uutf.String.fold_utf_8 add_alpha () s;
  Buffer.contents b
# keep_alpha "Wë_1ird?!";;
- : string = "Wëird"
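The same "walk the code points and keep only the letters" idea can be sketched in Python 3 for comparison. This is an illustration, not a translation of the Uutf/Uucp code: str.isalpha() tests the Unicode letter categories, which is close to but not identical to the Alphabetic property that Uucp.Alpha.is_alphabetic checks, and the function name simply mirrors the one above.

```python
def keep_alpha(s: str) -> str:
    """Keep only the characters that Python classifies as letters."""
    # str.isalpha() is true for characters in the Unicode letter
    # categories; digits, underscores and punctuation are dropped.
    return "".join(ch for ch in s if ch.isalpha())


print(keep_alpha("W\u00eb_1ird?!"))  # Wëird
```

Note that this filter also drops combining marks, so it assumes accented letters are stored in precomposed form.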
I'm not an expert in regexes and UTF, but if I were in your shoes I would use the re2 library. This is my first approximation:
open Core.Std
open Re2.Std
open Re2.Infix
let drop _match = ""
let keep_alpha s = Re2.replace ~/"\\PL" ~f:drop s
The first three lines open libraries and bring their definitions into scope. You do not need to open a library to use it, but otherwise you need to prefix each definition. The OCaml Core library is specially designed so that a user should open its Std submodule to bring all the necessary definitions into scope. The Re2 library is from the same authors and follows consistent conventions. open Re2.Infix brings infix (and prefix) operators into scope, namely ~/, which creates a regex from a string. The drop function just ignores its argument and returns an empty string. I've prefixed the parameter with an underscore, since that is the convention for unused parameters (respected by the compiler). You can also use a plain underscore as a wildcard instead, as in let drop _ = "". Next is the keep_alpha function, which substitutes an empty string for any UTF symbol that doesn't belong to the Unicode letter class, i.e., removes it from the output.
Update
I've checked my code and fixed the errors. I would also like to show how to play with this code in the toplevel. You have several options, but the easiest is to use the coretop script that ships with the core library. It uses the utop toplevel, so make sure that you have installed it:
$ opam install -y utop
Once that is done, you can start the toplevel:
$ coretop -require re2
The -require re2 flag will automatically find the re2 library and load it into your toplevel. You can load additional libraries without restarting utop with the following command:
# #require "libname";;
The first # is the toplevel's prompt, which you shouldn't type, but the second is the start of the directive, so make sure that you actually type it. Every directive starts with the # symbol. There are other useful directives in utop, namely:
# #use "filename.ml";; (* will load and evaluate filename.ml *)
# #list;; (* will list all available packages *)
# #typeof "keep_alpha";; (* will infer and print type of expression *)
The toplevel will not evaluate your code until you terminate it with the ;; sequence. You may sometimes see this ugly ;; in real code, but it is not needed there; it just tells the toplevel that you want it to evaluate your code right at this point and show you the result.

Emacs occur mode search for multiple strings

I have a plain text file with multiple patterns. Example:
DEBUG: i'm a debug line
DEBUG: Another 1
ERROR: this was an error
DEBUG: Another 2
NORMAL: EMACS
DEBUG: Another 3
ERROR: another error
The idea is to use occur-mode to filter the text file with the patterns I want. Example: DEBUG and ERROR.
As far as I understand, occur only works with a single string or a regex.
How can I use occur mode to filter on more than one string pattern? I would also accept another Emacs mode that filters strings in text.
You can pass a regexp that matches either of the strings to occur. E.g., type M-x occur RET DEBUG\|ERROR.
If it is a pattern you often use, here's a bit of elisp (based on legoscia's answer):
(defun myoccur (arg)
  (interactive "sList of space-separated args: ")
  (occur (s-replace " " "\\|" arg)))
It replaces each space with the \| alternation operator and calls occur.
PS: s-replace is not standard. You need (require 's), which loads the s.el library: https://github.com/magnars/s.el
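The underlying technique (join the search terms with the alternation operator, then keep the lines that match) can be sketched outside Emacs as well. This is a rough Python illustration, not how occur is implemented. The function name occur_lines is hypothetical, and unlike the elisp above the terms are escaped so they are matched literally.

```python
import re


def occur_lines(text: str, terms: str) -> list[str]:
    """Return the lines of text that contain any of the
    space-separated terms."""
    # Python spells alternation as "|" where Emacs regexps use "\|".
    pattern = re.compile("|".join(re.escape(t) for t in terms.split()))
    return [line for line in text.splitlines() if pattern.search(line)]


log = """DEBUG: i'm a debug line
DEBUG: Another 1
ERROR: this was an error
DEBUG: Another 2
NORMAL: EMACS
DEBUG: Another 3
ERROR: another error"""

for line in occur_lines(log, "DEBUG ERROR"):
    print(line)
```

With the sample file from the question, every line except the NORMAL one is printed.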

How to replace characters in string Erlang?

I have this piece of code that gets the session ID, turns it into a string, and then creates a set in Redis with a key such as {{1401,873063,143916},<0.16443.0>}. I'm trying to replace the { characters in this session ID with the letter "a".
OldSessionID= io_lib:format("~p",[OldSession#session.sid]),
StringForOldSessionID = lists:flatten(OldSessionID),
ejabberd_redis:cmd([["SADD", StringForSessionID, StringForUserInfo]]);
I've tried this:
re:replace(N,"{","a",[global,{return,list}]).
Is this a good way of doing this? I read that regexp in Erlang is not an advised way of doing things.
Your solution works, and if you are comfortable with it, you should keep it.
For my part, I prefer a list comprehension: [case X of ${ -> $a; _ -> X end || X <- StringForOldSessionID] (just because I don't have to check the function documentation :o)
re:replace(N,"{","a",[global,{return,list}]).
Is this a good way of doing this? I read that regexp in Erlang is not an advised way of doing things.
According to official documentation:
2.5 Myth: Strings are slow
Actually, string handling could be slow if done improperly. In Erlang, you'll have to think a little more about how the strings are used and choose an appropriate representation and use the re module instead of the obsolete regexp module if you are going to use regular expressions.
So, either you use re for strings, or:
leave { behind (using pattern matching):
if, say, N is {{1401,873063,143916},<0.16443.0>}, then
{{A,B,C},Pid} = N
and then format A, B, C and Pid into a string.
Since Erlang/OTP 20.0 you can use the string:replace/3 and string:replace/4 functions from the string module.
string:replace/4 replaces SearchPattern in String with Replacement. The fourth parameter indicates whether the leading, the trailing or all occurrences of SearchPattern are to be replaced; string:replace/3 replaces only the leading one.
string:replace(Input, "{", "a", all).
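The three approaches suggested in this thread (a global regex replace, a per-character mapping, and a plain string replace) are interchangeable for a single literal character. The sketch below illustrates that in Python rather than Erlang; the function names are hypothetical and the session string is the example key from the question.

```python
import re


def with_regex(s: str) -> str:
    # Counterpart of re:replace/4 with the global option.
    return re.sub(r"\{", "a", s)


def with_mapping(s: str) -> str:
    # Counterpart of the list comprehension over the characters.
    return "".join("a" if ch == "{" else ch for ch in s)


def with_replace(s: str) -> str:
    # Counterpart of string:replace/4 with the atom all.
    return s.replace("{", "a")


session = "{{1401,873063,143916},<0.16443.0>}"
print(with_regex(session))
# aa1401,873063,143916},<0.16443.0>}
```

All three return the same string here, so the choice is mostly about readability and about avoiding regex machinery when a literal replacement is enough.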

StackOverflowError with Checkstyle 4.4 RegExp check

Hello,
Background:
I'm using Checkstyle 4.4.2 with a RegExp checker module to detect when the file name in our Java source headers does not match the name of the class or interface in which they reside. This can happen when a developer copies a header from one class to another and does not modify the "File:" tag.
The regular expression used in the RegExp checker has been through many incarnations and (though it is possibly overkill at this point) looks like this:
File: (\w+)\.java\n(?:.*\n)*?(?:[\w|\s]*?(?: class | interface )\1)
The basic form of files I am checking (though greatly simplified) looks like this
/*
*
* Copyright 2009
* ...
* File: Bar.java
* ...
*/
package foo
...
import ..
...
/**
* ...
*/
public class Bar
{...}
The Problem:
When no match is found, (i.e. when a header containing "File: Bar.java" is copied into file Bat.java ) I receive a StackOverflowError on very long files (my test case is #1300 lines).
I have experimented with several visual regular expression testers and can see that, in the non-matching case, when the regex engine passes the line containing the class or interface name it starts searching again on the next line and does some backtracking, which probably causes the StackOverflowError.
The Question:
How can I prevent the StackOverflowError by modifying the regular expression?
Is there some way to modify my regular expression so that in the non-matching case (i.e. when a header containing "File: Bar.java" is copied into file Bat.java) the matching stops once it examines the line containing the interface or class name and sees that "\1" does not match the first group?
Alternatively, if that cannot be done, is it possible to minimize the searching and matching that takes place after it examines the line containing the interface or class, thus minimizing processing and (hopefully) avoiding the StackOverflowError?
Try
File: (\w+)\.java\n.*^[\w \t]+(?:class|interface) \1
in dot-matches-all mode. Rationale:
[\w\s] (the | doesn't belong there) matches anything, including line breaks. This results in a lot of backtracking back up into the lines that the previous part of the regex had matched.
If you let the greedy dot gobble up everything up to the end of the file (quick) and then backtrack until you find a line that starts with words or spaces/tabs (but no newlines) and then class or interface and \1, then that doesn't require as much stack space.
A different, and probably even better solution would be to split the problem into parts.
First match the File: (\w+)\.java part. Then do a second search with ^[\w \t]+(?:class|interface) plus the \1 match from the first search on the same file.
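The two-step idea can be sketched as follows. This is a Python illustration and not Checkstyle code: the function name check_file_tag is hypothetical, the trailing \b is an addition that is not in the expression above (it stops Bar from matching Barn), and a missing "File:" tag is reported as None by assumption.

```python
import re
from typing import Optional

FILE_TAG = re.compile(r"File: (\w+)\.java")


def check_file_tag(source: str) -> Optional[bool]:
    """Return True if the type named in the "File:" tag is declared in
    the source, False if it is not, and None if there is no tag."""
    # Step 1: find the name recorded in the header.
    tag = FILE_TAG.search(source)
    if tag is None:
        return None
    # Step 2: search for a declaration line using that literal name,
    # so no backreference across many lines is needed.
    name = re.escape(tag.group(1))
    declaration = re.compile(
        r"^[\w \t]+(?:class|interface) " + name + r"\b", re.MULTILINE
    )
    return declaration.search(source) is not None


good = "/*\n * File: Bar.java\n */\npackage foo;\n\npublic class Bar\n{}\n"
bad = good.replace("class Bar", "class Bat")
print(check_file_tag(good), check_file_tag(bad))  # True False
```

Because each search is confined to a single line, neither step has to backtrack across the rest of the file.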
Follow up:
I plugged in Tim Pietzcher's suggestion above and his greedy solution did indeed fail faster and without a StackOverflowError when no match was found. However, in the positive case, the StackOverflowError still occurred.
I took a look at the source code of RegexpCheck.java. The class's pattern is compiled in multiline mode, so that ^ and $ match just after or just before, respectively, a line terminator or the end of the input sequence. It then reads the entire class file into a string and does a recursive search for the pattern (see findMatch()). That is undoubtedly the source of the StackOverflowError.
In the end I didn't get it to work (and gave up). Since Maven 2 released the maven-checkstyle-plugin-2.4/Checkstyle 5.0 about 6 weeks ago, we've decided to upgrade our tools. This may not solve the StackOverflowError problem, but it will give me something else to work on until someone decides that we need to pursue this again.