POSIX regex for version strings

I have a source over which I have no control, and I want to filter out all strings that contain anything other than digits and dots.
For example, out of these:
9
8.1.0
5.0
9.0
5.1
8.0.0
7.0 (cdfsdsdsd)
5.0.2
8
7.0.1
7.1
6.0
7.0
Over 32323
7.0 rdx K9 bnsm
9.2.3
8.oo
pp
unknown
8.0_vgfe10051988
6.0.1
8.0.0-derv10051988
9.1
9.0.0
8.0.1
7.0_xccv10051988
7.1.3
10.0
7.0.X.1.C
8.0.0_vged10051988
4.4.4
7.1.2
7.0 [NKL 24 | ABC]
8.1
7.1.1
5.1.1
7.0_Jgrd10051988
9.XXX
9.0.1
8.0
5.0.1
8.1.1
10
Out of these, I need only the strings that consist solely of digits and dots (.):
9
8.1.0
5.0
9.0
5.1
8.0.0
5.0.2
8
7.0.1
7.1
6.0
7.0
9.2.3
6.0.1
9.1
9.0.0
8.0.1
7.1.3
10.0
4.4.4
7.1.2
8.1
7.1.1
5.1.1
9.0.1
8.0
5.0.1
8.1.1
10
I have tried many regexes, but nothing seems to be generic enough.
This regex, [0-9]*.?[0-9], matches the strings containing letters too.
The one I have got working is ^(\*|\d+(\.\d+){0,2}(\.\*)?)$, but this is not POSIX.
How do I get a POSIX regex which also works on Redshift?

Looking at the Amazon documentation, it seems that Redshift supports POSIX ERE. Please try:
^[[:digit:]]+(\.[[:digit:]]+)*$
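For a quick sanity check outside Redshift, here is a minimal Python sketch of the same pattern. Python's re module does not support the [[:digit:]] bracket class, so [0-9] is used as an assumed equivalent; keep_versions is a hypothetical helper name, and the sample values are taken from the question.

```python
import re

# [0-9] stands in for [[:digit:]], which Python's re module does not support.
VERSION_RE = re.compile(r"[0-9]+(\.[0-9]+)*")

def keep_versions(values):
    """Return only the strings made of dot-separated runs of digits."""
    # fullmatch anchors at both ends, like ^...$ in the POSIX pattern.
    return [v for v in values if VERSION_RE.fullmatch(v)]

samples = ["9", "8.1.0", "7.0 (cdfsdsdsd)", "Over 32323", "8.oo", "pp",
           "8.0_vgfe10051988", "8.0.0-derv10051988", "7.0.X.1.C", "9.XXX",
           "10.0", "10"]
print(keep_versions(samples))
```

Only the purely numeric entries (9, 8.1.0, 10.0 and 10) survive the filter in this sample.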

Your regex works, you just need to use double backslashes in the string literal.
According to the Amazon Redshift "POSIX Operators" documentation,
Amazon Redshift supports the following Perl-influenced operators in regular expressions. Escape the operator using two backslashes (‘\\’).
So, you may use
'^(\\*|\\d+(\\.\\d+){0,2}(\\.\\*)?)$'

The simplest is:
^[.0-9]+$
If you don't have support for extended regex, you can do:
^[.0-9][.0-9]*$
I ran this command on your input and output and got an empty diff:
$ diff <(grep -P '^[.0-9]+$' input) output
$ echo $?
0
On your specific input, even ^[.0-9]*$ would work.
Note, however, that there's a difference between "Strings with only digits and ." and "version string". The simple regex will also catch inputs like:
1..2
..
.
0...
.1
If that's not a problem, you can use the simple regex.
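To make the difference concrete, here is a small Python sketch that runs the simple character-class regex next to a stricter "dot-separated digit groups" pattern. The names LOOSE, STRICT and classify are made up for this illustration, and [0-9] is used because Python's re module has no POSIX bracket classes.

```python
import re

LOOSE = re.compile(r"[.0-9]+")             # any mix of digits and dots
STRICT = re.compile(r"[0-9]+(\.[0-9]+)*")  # digit groups separated by single dots

def classify(value):
    """Return (loose_match, strict_match) for one candidate string."""
    return (LOOSE.fullmatch(value) is not None,
            STRICT.fullmatch(value) is not None)

for candidate in ["8.1.0", "1..2", "..", ".", "0...", ".1"]:
    print(candidate, classify(candidate))
```

In this sketch, only 8.1.0 passes both patterns; the remaining candidates pass the loose pattern alone.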


QMake not matching regex for distro detection

I have a C++ project using QMake. I'm trying to set some compiler options based on a simple test of which Linux distro is running, but the test does not pass. My qmake file contains:
OSDISTRO = $$(cat /proc/version)
contains(OSDISTRO, "Ubuntu"): {
message(Found ubuntu)
}
I tested the regex from the command line and it works!
cat /proc/version | pcregrep "Ubuntu"
Linux version 4.18.0-20-generic (buildd@lcy01-amd64-020) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #21~18.04.1-Ubuntu SMP Wed May 8 08:43:37 UTC 2019
Is there something special about the regex syntax in qmake? Any reason why this isn't working?
For the RegEx:
This works for me:
OSDISTRO = $$system(cat /proc/version)
contains(OSDISTRO, .*Ubuntu.*){
message("Found Ubuntu")
}
Note:
The match is case sensitive.
You can use .*[uU]buntu.* for example to match ubuntu and Ubuntu.
Explanation why your solution does not work:
The QMake function contains works with lists of values.
So, the execution of your solution goes like this:
1) First instruction OSDISTRO = $$(cat /proc/version):
Assume QMake runs the command as $$system(cat /proc/version) would (on its own, $$(...) reads an environment variable rather than running a command).
The result is then split (with whitespace as the separator) into a list of values, and OSDISTRO will contain this list.
Assuming the output is the same as yours, the result of the first instruction looks like this:
OSDISTRO = "Linux" "version" "4.18.0-20-generic"....
2) Second instruction contains(OSDISTRO, "Ubuntu") : message(Found ubuntu):
QMake checks whether the variable OSDISTRO contains the value Ubuntu and displays the message Found ubuntu on success.
In your case, QMake will never find Ubuntu, because the values that contain it look like (Ubuntu or #21~18.04.1-Ubuntu, while QMake looks for a value that is exactly Ubuntu.
Hope it helps you.
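The list behaviour described above can be modelled roughly in Python. This is only a sketch under the assumption that contains() requires a whole list element to match the pattern; qmake_contains is a hypothetical stand-in and not a QMake API, and the input is an abbreviated copy of the line from the question.

```python
import re

def qmake_contains(values, pattern):
    """Rough model: True if any single list element matches the pattern as a whole."""
    return any(re.fullmatch(pattern, v) for v in values)

# Abbreviated copy of the /proc/version line from the question.
proc_version = ("Linux version 4.18.0-20-generic "
                "(gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) "
                "#21~18.04.1-Ubuntu SMP")
values = proc_version.split()  # whitespace-separated list of values

print(qmake_contains(values, "Ubuntu"))      # no element is exactly "Ubuntu"
print(qmake_contains(values, ".*Ubuntu.*"))  # "(Ubuntu" matches as a whole
```

Under that assumption, the bare Ubuntu pattern fails because the nearest elements are (Ubuntu and #21~18.04.1-Ubuntu, while the .*Ubuntu.* pattern succeeds.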

Value of `__GLIBCXX__` for each libstdc++ release

The macro __GLIBCXX__ contains the time stamp of libstdc++ releases, e.g., from gcc documentation (https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_macros.html)
__GLIBCXX__
The current version of libstdc++ in compressed ISO date format, as an unsigned long. For details on the value of this particular macro for a particular release, please consult the ABI Policy and Guidelines appendix.
I am looking for the values for all releases since the release of 4.9.0 (including releases of smaller versions like 4.8.x).
The documentation of libstdc++ does not seem to provide this information (it only provides the dates up to gcc 4.7.0).
Where can I find the values of __GLIBCXX__? Does anybody have them?
The ABI Policy and Guidelines appendix (https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html) says
Incremental bumping of a library pre-defined macro. For releases before 3.4.0, the macro is __GLIBCPP__. For later releases, it's __GLIBCXX__. (The libstdc++ project generously changed from CPP to CXX throughout its source to allow the "C" pre-processor the CPP macro namespace.) These macros are defined as the date the library was released, in compressed ISO date format, as an unsigned long.
but then only provides the values of the macro up to GCC 4.7.0. The release dates of particular GCC versions are listed here:
https://gcc.gnu.org/releases.html
but, for example, GCC 4.9.1 has the release date "July 16, 2014", which in ISO date format is 20140716, while the value of __GLIBCXX__ is 20140617 (notice the 7 and 6 have been switched).
The information you want is useless anyway, so you should solve your problem a different way.
GCC 4.9.3 was released after GCC 5.1, so it has a later date in that macro, so you can't just do something like:
#if __GLIBCXX__ > 20150422 // GCC 5.1 release
because that would be true for 4.9.3, but that doesn't have all the features that 5.1 has.
Most GNU/Linux distros don't ship official FSF releases either, they build snapshots, which will have the date of the snapshot, which won't be in any list of release dates. And a snapshot from the 5.x branch on a given day will have the same date as a snapshot from the 6.x branch on a given day, so you can't tell them apart.
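The pitfall is easy to reproduce with a few of the official datestamps quoted in this thread. The Python sketch below mirrors the naive preprocessor test; naive_newer_than_5_1 is a made-up name for illustration.

```python
# Datestamps of official releases, copied from the release list in this thread.
GLIBCXX_DATES = {
    "4.9.2": 20141030,
    "4.9.3": 20150626,
    "5.1.0": 20150422,
    "5.2.0": 20150716,
}

def naive_newer_than_5_1(datestamp):
    """The tempting but wrong test: compare against the 5.1 release date."""
    return datestamp > GLIBCXX_DATES["5.1.0"]

for release, stamp in GLIBCXX_DATES.items():
    print(release, naive_newer_than_5_1(stamp))
```

The check accepts 4.9.3 because its datestamp is later than that of 5.1.0, even though it comes from the older 4.9 branch.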
In the interest of answering the original question, here's a hacky command you can execute in your shell to get the list of releases and the value of __GLIBCXX__ for each release (starting with v4.1.0):
svn list "svn://gcc.gnu.org/svn/gcc/tags" | grep -o "gcc_\([^34]_.*\|4_[^0]_.*\)_release" | xargs -n 1 -I {} sh -c "printf \"{}: \" && svn cat svn://gcc.gnu.org/svn/gcc/tags/{}/gcc/DATESTAMP"
The results are:
4.1.0: 20060228
4.1.1: 20060524
4.1.2: 20070214
4.2.0: 20070514
4.2.1: 20070719
4.2.2: 20071007
4.2.3: 20080201
4.2.4: 20080519
4.3.0: 20080305
4.3.1: 20080606
4.3.2: 20080827
4.3.3: 20090124
4.3.4: 20090804
4.3.5: 20100522
4.3.6: 20110627
4.4.0: 20090421
4.4.1: 20090722
4.4.2: 20091015
4.4.3: 20100121
4.4.4: 20100429
4.4.5: 20101001
4.4.6: 20110416
4.4.7: 20120313
4.5.0: 20100414
4.5.1: 20100731
4.5.2: 20101216
4.5.3: 20110428
4.5.4: 20120702
4.6.0: 20110325
4.6.1: 20110627
4.6.2: 20111026
4.6.3: 20120301
4.6.4: 20130412
4.7.0: 20120322
4.7.1: 20120614
4.7.2: 20120920
4.7.3: 20130411
4.7.4: 20140612
4.8.0: 20130322
4.8.1: 20130531
4.8.2: 20131016
4.8.3: 20140522
4.8.4: 20141219
4.8.5: 20150623
4.9.0: 20140422
4.9.1: 20140716
4.9.2: 20141030
4.9.3: 20150626
5.1.0: 20150422
5.2.0: 20150716
5.3.0: 20151204
6.1.0: 20160427
6.2.0: 20160822
6.3.0: 20161221
6.4.0: 20170704
7.1.0: 20170502
7.2.0: 20170814
7.3.0: 20180125
Note that these values are from the official releases from the GCC team. If you're using an unofficial release, the values might differ slightly.
You can generate a list of possible __GLIBCXX__ values using the SVN release listing as source:
svn list --xml 'https://gcc.gnu.org/svn/gcc/tags' \
| grep '>gcc.*release' -A4 \
| grep 'name\|date' \
| sed -e 's/<[^>]\+>//g' -e 's/T.*$//' -e 's/-//g' \
-e 's/gcc_\|_release//g' \
| paste - -
A similar list, more free-form and annotated with branching ASCII art, is maintained by the GCC team:
https://gcc.gnu.org/develop.html#timeline
Note that multiple release branches are active in parallel, cf e.g. the 4.8 and 4.9 branches:
4_8_0 20130322
4_8_1 20130531
4_8_2 20131016
4_8_3 20140522
4_8_4 20141219
4_8_5 20150623
4_9_0 20140422
4_9_1 20140716
4_9_2 20141030
4_9_3 20150626
4_9_4 20160803
Thus, unfortunately, you can't use a single date as a simple cut-off value to determine a certain release.
Of course, you can auto-generate some helper macros from this list. Say you need a workaround for the libstdc++ 4.8 releases (as used by GCC and various clang versions); then you could define a helper macro like this (after including some standard library header):
#if __GLIBCXX__ == 20130322 \
|| __GLIBCXX__ == 20130531 \
|| __GLIBCXX__ == 20131016 \
|| __GLIBCXX__ == 20140522 \
|| __GLIBCXX__ == 20141219 \
|| __GLIBCXX__ == 20150623
#define HAVE_GLIBCXX_4_8 1
#else
#define HAVE_GLIBCXX_4_8 0
#endif
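A helper macro of this shape could be generated with a few lines of Python. This is only a sketch: helper_macro is a hypothetical function, and the datestamps are the 4.8.x values listed above.

```python
def helper_macro(name, datestamps):
    """Build an #if/#define block that is 1 when __GLIBCXX__ equals any datestamp."""
    # Join the comparisons with a line continuation, as in the hand-written macro.
    conditions = " \\\n || ".join(f"__GLIBCXX__ == {d}" for d in datestamps)
    return (f"#if {conditions}\n"
            f"#define {name} 1\n"
            "#else\n"
            f"#define {name} 0\n"
            "#endif\n")

GLIBCXX_4_8 = [20130322, 20130531, 20131016, 20140522, 20141219, 20150623]
print(helper_macro("HAVE_GLIBCXX_4_8", GLIBCXX_4_8))
```

Note that this only covers official releases; distro snapshots carry other dates and would need to be added to the list by hand.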
If you are just interested in the major version and only need to support releases newer than GCC 7, then you can also use the _GLIBCXX_RELEASE macro.

Detecting clang on different platforms

I am faced with the problem of parsing the clang version for different vendors from the output of clang --version|head -1:
Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn) => 3.5
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512 => 3.4.1
clang version 3.5.0 (tags/RELEASE_350/rc2) => 3.5.0
I now have this regular expression:
match: (clang version|based on LLVM)\ (\d+)(\.\d+)(\.\d+)?
to \2\3\4
I need to exclude (clang version|based on LLVM), i.e. \1, from the result of match().
Your question looks like you're expecting CMake's regex handling to be Perl-like, but it's pretty different.
In CMake syntax, the following should do what you want:
if("${CMAKE_CXX_COMPILER_ID}" STREQUAL "Clang")
execute_process(COMMAND clang --version COMMAND head -1 OUTPUT_VARIABLE ClangVersionLine)
string(REGEX MATCH "(clang version|based on LLVM) ([0-9]\\.[0-9]\\.?[0-9]?)" UnusedVar "${ClangVersionLine}")
set(ClangVersion "${CMAKE_MATCH_2}")
message("ClangVersion - ${ClangVersion}")
endif()
Ok, since the tool you use is based on PCRE, this pattern should work:
(?m)(?:^Apple .*)?\K \d+(?:\.\d+){1,2}
The (?m) only matters when the input has several lines; since you only test one line, you can remove it.
The \K discards everything matched to its left from the match result.
If you use another language/tool where the \K feature is not available you can use a capturing group to extract the information you want from the whole match result:
(?:^Apple .*)? ([0-9]+(?:[.][0-9]+){1,2})
With cmake:
string(REGEX REPLACE "(^Apple .*)? ([0-9]+([.][0-9]+){1,2}).*" "\\2" CLANG_VERSION "${_clang_version_info}")
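The capturing-group variant can be tried out quickly in Python, which has no \K either. This is a sketch only; clang_version is a hypothetical helper, and the three test lines are the ones from the question.

```python
import re

# The optional "Apple ..." prefix is matched greedily, so for Apple builds the
# last " <version>" on the line (the upstream LLVM version) is captured.
CLANG_VERSION_RE = re.compile(r"(?:^Apple .*)? ([0-9]+(?:[.][0-9]+){1,2})")

def clang_version(first_line):
    """Return the version from the first line of `clang --version`, or None."""
    match = CLANG_VERSION_RE.search(first_line)
    return match.group(1) if match else None

lines = [
    "Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)",
    "FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512",
    "clang version 3.5.0 (tags/RELEASE_350/rc2)",
]
for line in lines:
    print(clang_version(line))
```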

PowerShell Regex to find " Version 12.3 "

In a string such as:
CustomerDisplayVersionNumber : Version 12.3 (build 567.89)
How can I, using a regex, return true if the version number is exactly 12.3?
There is the slight, but real possibility that the version number might contain a service pack version. For example
CustomerDisplayVersionNumber : Version 12.3.4 (build 567.89)
Which leads me to believe it will be safer to check for [space]Version 12.3.nnnn[space].
The regex "\sVersion (\d+\.\d+(\.\d+)?)\s" will satisfy all the provided examples:
@('CustomerDisplayVersionNumber : Version 12.3 (build 567.89)', 'CustomerDisplayVersionNumber : Version 12.3.4 (build 567.89)', 'CustomerDisplayVersionNumber : Version 12.3.9999 (build 567.89)') | % { [regex]::Match($_, "\sVersion (\d+\.\d+(\.\d+)?)\s").Success }
Use this regex: \s+Version \d+\.\d+\.\d+\s+ (note that it requires three numeric components, so it matches 12.3.4 but not plain 12.3).
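Neither pattern above checks that the number is actually 12.3; both only check the shape of the version. A minimal Python sketch of an exact check, assuming an optional third component should be tolerated, could look like this (is_version is a made-up helper name):

```python
import re

# Same shape as the first regex above, with major and minor captured separately.
VERSION_RE = re.compile(r"\sVersion (\d+)\.(\d+)(?:\.(\d+))?\s")

def is_version(line, major, minor):
    """True if the line holds 'Version major.minor' with an optional third part."""
    match = VERSION_RE.search(line)
    if match is None:
        return False
    # Compare as integers so that 12.30 is not mistaken for 12.3.
    return (int(match.group(1)), int(match.group(2))) == (major, minor)

print(is_version("CustomerDisplayVersionNumber : Version 12.3 (build 567.89)", 12, 3))
print(is_version("CustomerDisplayVersionNumber : Version 12.3.4 (build 567.89)", 12, 3))
print(is_version("CustomerDisplayVersionNumber : Version 12.30 (build 567.89)", 12, 3))
```

Comparing the captured numbers instead of hard-coding 12\.3 in the pattern keeps the regex reusable for other versions.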

Perl Regex to match alphanumeric

I am trying to get the kernel and gcc versions by reading the /proc/version file using Perl, but I am not sure about the regex to match the versions. I tried something like this:
/Linux version (\d+)*gcc version(\d+)*/
But it's not working. Thanks in advance; I am a newbie to Perl. The contents of /proc/version are:
Linux version 2.6.32-21-generic (buildd@rothera) (gcc version 4.4.3
(Ubuntu 4.4.3-4ubuntu5) ) #32-Ubuntu SMP Fri Apr 16 08:10:02 UTC 2010
Try this:
$version =~ /Linux version ([\d.-]+)-\D.*gcc version ([\d.]+) /;
print "Linux version: $1\ngcc version: $2\n";
The output:
Linux version: 2.6.32-21
gcc version: 4.4.3
This regex will work for you:
/Linux version ([\w.-]*).*?gcc version ([\w.]*)/
The first captured group will hold the Linux version, 2.6.32-21-generic, and the second captured group will hold the gcc version, 4.4.3. If you do not want to capture generic, use \d instead of \w.
m{Linux version\s(\S+).*gcc version\s(\S+)} and print "$1\n$2\n"
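For comparison, the same pattern carries over to Python almost unchanged. The sketch below uses an abbreviated copy of the line from the question; kernel_and_gcc is a hypothetical helper name.

```python
import re

# \S+ grabs everything up to the next whitespace after each "version" keyword.
PROC_VERSION_RE = re.compile(r"Linux version\s(\S+).*gcc version\s(\S+)")

def kernel_and_gcc(text):
    """Return (kernel_version, gcc_version), or None if the text does not match."""
    match = PROC_VERSION_RE.search(text)
    return match.groups() if match else None

# Abbreviated copy of the /proc/version contents from the question.
sample = ("Linux version 2.6.32-21-generic "
          "(gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #32-Ubuntu SMP")
print(kernel_and_gcc(sample))
```

On a real system the input would come from reading /proc/version, for example with open("/proc/version").read().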