How to concatenate fields in select statements using Doctrine - doctrine-orm

I'm wondering how to concatenate two fields in a DQL select statement with a literal between them.
I have this so far, but no luck:
$qb
->select('season.id, concat(competition.name, '-',season.name) AS specs')
->leftJoin('season.competition', 'competition')
->where('season.name LIKE :q')
->setParameter('q', '%'.$q.'%')
->setMaxResults($p)
;

CONCAT cannot take three arguments here, but you can nest two calls like this:
$em = \Zend_Registry::get('em');
$qb_1 = $em->createQueryBuilder();
$q_1 = $qb_1->select( "reprt_abs.id" )
->addSelect( "CONCAT( CONCAT(reporter.firstname, ' '), reporter.lastname)" )
->from( '\Entities\report_abuse', 'reprt_abs' )
->leftJoin( 'reprt_abs.User', 'reporter' )
->getQuery()->getResult();
This is the part you want:
$qb_1->select( "reprt_abs.id" )
->addSelect( "CONCAT( CONCAT(reporter.firstname, ' '), reporter.lastname)" )
This is the output on my side:
array (size=19)
0 =>
array (size=2)
'id' => int 1
1 => string 'Jaskaran Singh' (length=14)
1 =>
array (size=2)
'id' => int 9
1 => string 'Harsimer Kaur' (length=14)
2 =>
array (size=2)
'id' => int 12
1 => string 'Jaskaran Singh' (length=14)
3 =>
array (size=2)
'id' => int 16
1 => string 'Jaskaran Singh' (length=14)
4 =>
array (size=2)
'id' => int 19
1 => string 'Jaskaran Singh' (length=14)
5 =>
array (size=2)
'id' => int 4
1 => string 'shilpi jaiswal' (length=14)
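The nesting shown above generalizes to any number of parts when only two arguments are accepted per call. A minimal sketch, written in Python purely to build the expression string; nest_concat is a hypothetical helper for illustration and not part of Doctrine:

```python
from functools import reduce


def nest_concat(parts):
    """Fold parts into nested two-argument CONCAT calls, left to right."""
    if not parts:
        raise ValueError("need at least one part")
    # reduce() starts with the first part and wraps it once per remaining part.
    return reduce(lambda acc, part: f"CONCAT({acc}, {part})", parts)


expr = nest_concat(["reporter.firstname", "' '", "reporter.lastname"])
print(expr)
```

The printed string can then be passed to addSelect() in the same way as the hand-written expression.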

The solution I use in Doctrine 2.4+ on the MySQL PDO platform:
$concat = new Query\Expr\Func('CONCAT', $name[$k]);
$concat .= ' as ' . $k;
$concat = str_replace(',', ',\' \',', $concat);
$this->query->addSelect($concat);
So $name[$k] is an array of fields, as many as you wish. I then add some spacing between the fields with the str_replace. $k is the name of the concat field, so the result of $concat is
"CONCAT(p.email,' ', h.phoneNumber,' ', p.officialName) as details"
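Replacing every comma after the fact would also touch commas inside a field expression, so an alternative is to insert the separator while joining the fields. A minimal sketch in Python that only builds the string; concat_with_separator and its parameters are hypothetical names, and the separator is assumed to contain no quote characters:

```python
def concat_with_separator(fields, alias, sep=" "):
    """Build a multi-argument CONCAT expression with a literal between fields."""
    if not fields:
        raise ValueError("need at least one field")
    if "'" in sep:
        # Quote escaping is deliberately left out of this sketch.
        raise ValueError("separator must not contain a single quote")
    # Put the quoted separator between each pair of neighbouring fields.
    args = f", '{sep}', ".join(fields)
    return f"CONCAT({args}) AS {alias}"


expr = concat_with_separator(
    ["p.email", "h.phoneNumber", "p.officialName"], "details"
)
print(expr)
```

This relies on CONCAT accepting more than two arguments, as in the answer above.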

Related

How to extract the year via regex from a string in Ruby

I'm trying to extract the year from a string with this format:
dataset_name = 'ALTVALLEDAOSTA000020191001.json'
I tried:
dataset_name[/<\b(19|20)\d{2}\b>/, 1]
/\b(19|20)\d{2}\b/.match(dataset_name)
I'm still reading the docs but so far I'm not able to achieve the result I want. I'm really bad at regex.
Since your dataset name always ends in yyyymmdd.json, you can slice out the year by counting from the end of the string. The range -13...-9 covers the four characters that come right before mmdd.json:
irb(main):001:0> dataset_name = 'ALTVALLEDAOSTA000020191001.json'
irb(main):002:0> dataset_name[-13...-9]
=> "2019"
You can also use a regex if you want a bit more precision:
irb(main):003:0> dataset_name =~ /(\d{4})\d{4}\.json$/
=> 18
irb(main):004:0> $1
=> "2019"
There are many ways to get to Rome.
Starting with:
foo = 'ALTVALLEDAOSTA000020191001.json'
Stripping the extended filename + extension to its basename then using a regex:
ymd = /(\d{4})(\d{2})(\d{2})$/
ext = File.extname(foo)
File.basename(foo, ext) # => "ALTVALLEDAOSTA000020191001"
File.basename(foo, ext)[ymd, 1] # => "2019"
File.basename(foo, ext)[ymd, 2] # => "10"
File.basename(foo, ext)[ymd, 3] # => "01"
Using a regex against the entire filename to grab just the year:
ymd = /^.*(\d{4})\d{4}/
foo[ymd, 1] # => "2019"
or extracting the year, month and day:
ymd = /^.*(\d{4})(\d{2})(\d{2})/
foo[ymd, 1] # => "2019"
foo[ymd, 2] # => "10"
foo[ymd, 3] # => "01"
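The unanchored patterns depend on how a greedy .* backtracks: it gives characters back only until the rest of the pattern fits, so the groups land on the last digits that satisfy it. A small Python sketch of that behaviour (Python is used only for illustration; the variable names are made up here):

```python
import re

name = "ALTVALLEDAOSTA000020191001.json"

# A lone four-digit group after a greedy .* ends up on the LAST four digits.
last_four = re.match(r"^.*(\d{4})", name).group(1)

# Requiring four more digits after the group pushes it back onto the year.
year = re.match(r"^.*(\d{4})\d{4}", name).group(1)

# Three groups covering eight digits land on the last eight digits.
ymd = re.match(r"^.*(\d{4})(\d{2})(\d{2})", name).groups()

print(last_four, year, ymd)
```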
Using String's unpack:
ymd = '@18A4'
foo.unpack(ymd) # => ["2019"]
or:
ymd = '@18A4A2A2'
foo.unpack(ymd) # => ["2019", "10", "01"]
If the strings are consistent length and format, then I'd work with unpack, because, if I remember right, it is the fastest, followed by String slicing, with anchored, then unanchored regular expressions trailing.
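For comparison, the fixed-width approaches can be sketched in Python as well. This assumes, as above, that the date always starts at offset 18 and is followed by ".json"; struct.unpack_from is only a rough analogue of Ruby's unpack, and no speed claim is made for it:

```python
import struct

name = "ALTVALLEDAOSTA000020191001.json"

# Plain slicing from the end: the four characters just before "mmdd.json".
year = name[-13:-9]

# Fixed-width fields read from a byte offset: 4 + 2 + 2 bytes starting at 18.
raw = struct.unpack_from("4s2s2s", name.encode("ascii"), 18)
y, m, d = (part.decode("ascii") for part in raw)

print(year, y, m, d)
```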

Isolating URL paths with capturing groups

Is it possible to have n capture groups?
For example,
http://www.example.com/first-path
http://www.example.com/first-path/second-path
http://www.example.com/first-path/second-path/third-path
http://www.example.com/something.html
http://www.example.com/first-path?id=5
I am trying to capture first-path as group 1, second-path as group 2, and third-path as group 3 using http:\/\/(.*)\/(?!.*\/$)(.*), but it does not split the segments.
No specific programming language being used.
If you were using PHP, you could do something like this. The first split removes the leading http://www.example.com/ part, and the second then splits those values around the /:
$urls = array('http://www.example.com/first-path',
'http://www.example.com/first-path/second-path',
'http://www.example.com/first-path/second-path/third-path',
'http://www.example.com/something.html',
'http://www.example.com/first-path?id=5');
foreach ($urls as $url) {
$tail = preg_split('#https?://[^/]+/#', $url, -1, PREG_SPLIT_NO_EMPTY)[0];
$paths = preg_split('#/#', $tail);
print_r($paths);
}
Output:
Array
(
[0] => first-path
)
Array
(
[0] => first-path
[1] => second-path
)
Array
(
[0] => first-path
[1] => second-path
[2] => third-path
)
Array
(
[0] => something.html
)
Array
(
[0] => first-path?id=5
)
A similar thing could be done in JavaScript:
let urls = ['http://www.example.com/first-path',
'http://www.example.com/first-path/second-path',
'http://www.example.com/first-path/second-path/third-path',
'http://www.example.com/something.html',
'http://www.example.com/first-path?id=5'];
console.log(urls.map(s => s.split(/https?:\/\/[^\/]+\//)[1].split(/\//)))
Output:
Array(5) […]
0: Array [ "first-path" ]
1: Array [ "first-path", "second-path" ]
2: Array(3) [ "first-path", "second-path", "third-path" ]
3: Array [ "something.html" ]
4: Array [ "first-path?id=5" ]
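Since the question is language-agnostic, the same idea can be sketched in Python with a URL parser instead of a regex. Note one difference from the outputs above: urlsplit separates the query string, so "?id=5" is not part of the last segment. path_segments is a hypothetical helper name:

```python
from urllib.parse import urlsplit


def path_segments(url):
    """Return the non-empty path segments of a URL, without query or fragment."""
    return [seg for seg in urlsplit(url).path.split("/") if seg]


urls = [
    "http://www.example.com/first-path",
    "http://www.example.com/first-path/second-path",
    "http://www.example.com/first-path/second-path/third-path",
    "http://www.example.com/something.html",
    "http://www.example.com/first-path?id=5",
]
for url in urls:
    print(path_segments(url))
```

Splitting yields a list of any length, which sidesteps the need for a variable number of capture groups.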

Regex pattern to match groups starting with pattern

I am extracting data from a text stream that is structured as follows:
/1-<id>/<recType>-<data>..repeat n times../1-<id>/#-<data>..repeat n times..
In the above, the "/1" field precedes the record data which can then have any number of following fields, each with choice of recType from 2 to 9 (also, each field starts with a "/")
For example:
/1-XXXX/2-YYYY/9-ZZZZ/1-AAAA/3-BBBB/5-CCCC/8=NNNN/9=DDDD/1-QQQQ/2-WWWW/3=PPPP/7-EEEE
So, there are three groups of data above
1=XXXX 2=YYYY 9=ZZZZ
1=AAAA 3=BBBB 5=CCCC 8=NNNN 9=DDDD
1=QQQQ 2=WWWW 3=PPPP 7=EEEE
The data is simplified here; I know for certain that it only contains [A-Z0-9. ], but it can be of variable length (not just 4 chars as in the example).
Now, the following expression sort of works, but it only captures the first 2 fields of each group and none of the remaining fields:
/1-(?'fld1'[A-Z]+)/((?'fldNo'[2-9])-(?'fldData'[A-Z0-9\. ]+))
I know I need some sort of quantifier in there somewhere, but I do not know what or where to place it.
You can match these blocks with a single regex by relying on two .NET regex features: 1) capture collections and 2) multiple capturing groups with the same name in one pattern. A little LINQ then combines the captured data into a list of lists:
(?<fldNo>1)-(?'fldData'[^/]+)(?:/(?<fldNo>[2-9])[-=](?'fldData'[^/]+))*
Details:
(?<fldNo>1) - Group fldNo matching 1
- - a hyphen
(?'fldData'[^/]+) - Group "fldData" capturing 1+ chars other than /
(?:/(?<fldNo>[2-9])[-=](?'fldData'[^/]+))* - zero or more sequences of:
/ - a literal /
(?<fldNo>[2-9]) - a digit from 2 to 9 (Group "fldNo")
[-=] - a - or =
(?'fldData'[^/]+) - 1+ chars other than / (Group "fldData")
See C# demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;
public class Test
{
public static void Main()
{
var str = "/1-XXXX/2-YYYY/9-ZZZZ/1-AAAA/3-BBBB/5-CCCC/8=NNNN/9=DDDD/1-QQQQ/2-WWWW/3=PPPP/7-EEEE";
var res = Regex.Matches(str, @"(?<fldNo>1)-(?'fldData'[^/]+)(?:/(?<fldNo>[2-9])[-=](?'fldData'[^/]+))*")
.Cast<Match>()
.Select(p => p.Groups["fldNo"].Captures.Cast<Capture>().Select(m => m.Value)
.Zip(p.Groups["fldData"].Captures.Cast<Capture>().Select(m => m.Value),
(first, second) => first + "=" + second))
.ToList();
foreach (var t in res)
Console.WriteLine(string.Join(" ", t));
}
}
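Engines without capture collections can get the same grouping in two passes: one pattern finds each record, a second pulls the fields out of it. A minimal Python sketch under that assumption; RECORD, FIELD and parse_stream are names invented for this example:

```python
import re

# One record: a "/1" field followed by any number of "/2".."/9" fields.
RECORD = re.compile(r"/1[-=][^/]+(?:/[2-9][-=][^/]+)*")
# One field: its number and its data, separated by "-" or "=".
FIELD = re.compile(r"/([1-9])[-=]([^/]+)")


def parse_stream(stream):
    """Return one list of (field_number, data) tuples per record."""
    return [FIELD.findall(rec.group(0)) for rec in RECORD.finditer(stream)]


sample = ("/1-XXXX/2-YYYY/9-ZZZZ/1-AAAA/3-BBBB/5-CCCC/8=NNNN/9=DDDD"
          "/1-QQQQ/2-WWWW/3=PPPP/7-EEEE")
for record in parse_stream(sample):
    print(" ".join(f"{num}={data}" for num, data in record))
```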
I would suggest first splitting the string by /1, then using a pattern along these lines:
\/([1-9])[=-]([A-Z]+)
https://regex101.com/r/0nyzzZ/1
A single regex isn't the optimal tool for doing this (at least used in this way). The main reason is that your stream has a variable number of entries in it, and using a variable number of capture groups is not supported. I also noticed that some of the values are separated by "=" rather than a dash, which your current regex doesn't address.
The problem comes when you try and add a quantifier to a capture group - the group will only remember the last thing it captured, so if you add a quantifier, it will end up catching the first and last fields, leaving out all the rest of them. So something like this won't work:
\/1-(?'fld1'[A-Z]+)(?:\/(?'fldNo'[2-9])[-=](?'fldData'[A-Z]+))+
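The effect is easy to reproduce. The sketch below uses Python's re module as an example of an engine that keeps only the last iteration of a repeated group; the named-group syntax is adapted to Python's (?P<name>...) form:

```python
import re

pattern = re.compile(
    r"/1-(?P<fld1>[A-Z]+)(?:/(?P<fldNo>[2-9])[-=](?P<fldData>[A-Z]+))+"
)

m = pattern.match("/1-AAAA/3-BBBB/5-CCCC/8=NNNN/9=DDDD")

# The whole record is matched, but each repeated group holds only its
# final iteration; the fields in between are not retrievable.
print(m.group("fld1"), m.group("fldNo"), m.group("fldData"))
```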
If your streams were all the same length, a single regex could be used. Otherwise, you can do it with a foreach loop and a much simpler regex applied to each part of the stream (which also validates the stream as it goes along).
Now I'm not sure what language you're working with when using this, but here is a solution in PHP that I think delivers what you need.
function extractFromStream($str)
{
/*
* Get an array of [num]-[letters] with explode. This will make an array that
* contains [0] => 1-AAAA, [1] => 2-BBBB ... etc
*/
$arr = explode("/", substr($str, 1));
$sorted = array();
$key = 0;
/*
* Sort this data into key->values based on numeric ordering.
* If the next one has a lower or equal starting number than the one before it,
* a new entry will be created. i.e. 2-aaaa => 1-cccc will cause a new
* entry to be made, just in case the stream doesn't always start with 1.
*/
foreach ($arr as $value)
{
// This will get the number at the start, and has the added bonus of making sure
// each bit is in the right format.
if (preg_match("/^([0-9]+)[=-]([A-Z0-9. ]+)$/", $value, $matches)) {
$newKey = (int)$matches[1];
$match = $matches[2];
} else
throw new Exception("This is not a valid data stream!");
// This bit checks if we've got a lower starting number than last time.
if (isset($lastKey) && is_int($lastKey) && $newKey <= $lastKey)
$key += 1;
// Now sort them..
$sorted[$key][$newKey] = $match;
// This will be compared in the next iteration of the loop.
$lastKey = $newKey;
}
return $sorted;
}
Here's how you can use it...
$full = "/1-XXXX/2-YYYY/9-ZZZZ/1-AAAA/3-BBBB/5-CCCC/8=NNNN/9=DDDD/1-QQQQ/2-WWWW/3=PPPP/7-EEEE";
try {
$extracted = extractFromStream($full);
$stream1 = $extracted[0];
$stream2 = $extracted[1];
$stream3 = $extracted[2];
print "<pre>";
echo "Full extraction: \n";
print_r($extracted);
echo "\nFirst Stream:\n";
print_r($stream1);
echo "\nSecond Stream:\n";
print_r($stream2);
echo "\nThird Stream:\n";
print_r($stream3);
print "</pre>";
} catch (Exception $e) {
echo $e->getMessage();
}
This will print
Full extraction:
Array
(
[0] => Array
(
[1] => XXXX
[2] => YYYY
[9] => ZZZZ
)
[1] => Array
(
[1] => AAAA
[3] => BBBB
[5] => CCCC
[8] => NNNN
[9] => DDDD
)
[2] => Array
(
[1] => QQQQ
[2] => WWWW
[3] => PPPP
[7] => EEEE
)
)
First Stream:
Array
(
[1] => XXXX
[2] => YYYY
[9] => ZZZZ
)
Second Stream:
Array
(
[1] => AAAA
[3] => BBBB
[5] => CCCC
[8] => NNNN
[9] => DDDD
)
Third Stream:
Array
(
[1] => QQQQ
[2] => WWWW
[3] => PPPP
[7] => EEEE
)
So you can see you have the numbers as the array keys, and the values they correspond to, which are now readily accessible for further processing. I hope this helps you :)

Nesting the result of regular expression

I'm parsing some HTML like this
<h3>Movie1</h3>
<div class="time"><span>10:00</span><span>12:00</span></div>
<h3>Movie2</h3>
<div class="time"><span>13:00</span><span>15:00</span><span>18:00</span></div>
I'd like to get a result array that looks like this
0 =>
0 => Movie1
1 => Movie2
1 =>
0 =>
0 => 10:00
1 => 12:00
1 =>
0 => 13:00
1 => 15:00
2 => 18:00
I can do that in two steps
1) get the movie name and whole movie's schedule with tags by regexp like this
~<h3>(.*?)</h3>(?:.*?)<div class="time">(.*?)</div>~s
2) get time by regexp like this (I do it inside the loop for every movie I got on step 1)
~<span>([0-9]{2}:[0-9]{2})</span>~s
And it works well.
The question is: is there a regular expression that gives me the same result in only one step?
I tried nested groups like this
~<h3>(.*?)</h3>(?:.*?)<div class="time">((<span>(.*?)</span>)*)</div>~s
and I got only the last time of every movie (only 12:00 and 18:00).
With DOMDocument:
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$nodeList = $xpath->query('//h3|//div[@class="time"]/span');
$result = array();
$currentMovie = -1;
foreach ($nodeList as $node) {
if ($node->nodeName === 'h3') {
$result[0][++$currentMovie] = $node->nodeValue;
continue;
}
$result[1][$currentMovie][] = $node->nodeValue;
}
print_r($result);
Note: to be more rigorous, you can change the xpath query to:
//h3[following-sibling::div[@class="time"]] | //div[@class="time"]/span
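The same parser-based idea, walking the document once and attaching each time to the most recent title, can be sketched with Python's standard html.parser module. ScheduleParser is a hypothetical class for illustration and assumes the flat structure shown in the question (no nested div elements):

```python
from html.parser import HTMLParser


class ScheduleParser(HTMLParser):
    """Collect <h3> titles and the <span> times inside div.time blocks."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self.times = []
        self._in_time_div = False
        self._current = None  # tag whose text we are currently inside

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._current = "h3"
        elif tag == "div" and ("class", "time") in attrs:
            self._in_time_div = True
        elif tag == "span" and self._in_time_div:
            self._current = "span"

    def handle_endtag(self, tag):
        if tag == "div":
            self._in_time_div = False
        self._current = None

    def handle_data(self, data):
        if self._current == "h3":
            self.titles.append(data)
            self.times.append([])  # start a new schedule for this movie
        elif self._current == "span" and self.times:
            self.times[-1].append(data)


html = """<h3>Movie1</h3>
<div class="time"><span>10:00</span><span>12:00</span></div>
<h3>Movie2</h3>
<div class="time"><span>13:00</span><span>15:00</span><span>18:00</span></div>"""

parser = ScheduleParser()
parser.feed(html)
parser.close()
result = [parser.titles, parser.times]
print(result)
```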

Match EOL character?

I'm trying to capture the commands from a string of RTTTL commands like this:
2a4, 2e, 2d#, 2b4, 2a4, 2c, 2d, 2a#4, 2e., e, 1f4, 1a4, 1d#, 2e., d, 2c., b4, 1a4, 1p, 2a4, 2e, 2d#, 2b4, 2a4, 2c, 2d, 2a#4, 2e., e, 1f4, 1a4, 1d#, 2e., d, 2c., b4, 1a4
The regex I'm using is (\S+),|$ with global and multiline on, as I read that $ matches EOL when multiline mode is on, however this does not happen, and thus I cannot capture the last command 1a4, which ends the line. All the other commands are captured from the group.
What's the regex I should be using to capture the last command?
Just add a lookahead or non-capturing group like below. And get the string you want from group index 1.
(\S+)(?:,|$)
OR
(\S+)(?=,|$)
When using the lookahead, you don't need a capturing group either.
\S+(?=,|$)
(?=,|$) is a positive lookahead asserting that the match must be followed by a , or by the end-of-line anchor. \S+ matches one or more non-space characters.
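The difference comes down to alternation precedence: (\S+),|$ reads as "a command followed by a comma" or "end of line", so the last command is never captured. A short Python sketch on a shortened command string (the variable names are made up for this example):

```python
import re

commands = "2a4, 2e, 2d#, 1a4"

# Original pattern: the final match is the bare end-of-line alternative,
# so its capture group is empty.
original = re.findall(r"(\S+),|$", commands, flags=re.MULTILINE)

# Grouping the alternatives makes the comma OR the line end follow \S+.
grouped = re.findall(r"(\S+)(?:,|$)", commands, flags=re.MULTILINE)

# With a lookahead the delimiter is not consumed and no group is needed.
lookahead = re.findall(r"\S+(?=,|$)", commands, flags=re.MULTILINE)

print(original)
print(grouped)
print(lookahead)
```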
Another solution:
$a = " 2a4, 2e, 2d#, 2b4, 2a4, 2c, 2d, 2a#4, 2e., e, 1f4, 1a4, 1d#, 2e., d, 2c., b4, 1a4, 1p, 2a4, 2e, 2d#, 2b4, 2a4, 2c, 2d, 2a#4, 2e., e, 1f4, 1a4, 1d#, 2e., d, 2c., b4, 1a4";
$r=explode(",",preg_replace("/\\s+/","",$a));
var_dump($r);
output:
array (size=37)
0 => string '2a4' (length=3)
1 => string '2e' (length=2)
2 => string '2d#' (length=3)
3 => string '2b4' (length=3)
4 => string '2a4' (length=3)
5 => string '2c' (length=2)
6 => string '2d' (length=2)
7 => string '2a#4' (length=4)
8 => string '2e.' (length=3)
9 => string 'e' (length=1)
10 => string '1f4' (length=3)
11 => string '1a4' (length=3)
12 => string '1d#' (length=3)
13 => string '2e.' (length=3)
14 => string 'd' (length=1)
15 => string '2c.' (length=3)
16 => string 'b4' (length=2)
17 => string '1a4' (length=3)
18 => string '1p' (length=2)
19 => string '2a4' (length=3)
20 => string '2e' (length=2)
21 => string '2d#' (length=3)
22 => string '2b4' (length=3)
23 => string '2a4' (length=3)
24 => string '2c' (length=2)
25 => string '2d' (length=2)
26 => string '2a#4' (length=4)
27 => string '2e.' (length=3)
28 => string 'e' (length=1)
29 => string '1f4' (length=3)
30 => string '1a4' (length=3)
31 => string '1d#' (length=3)
32 => string '2e.' (length=3)
33 => string 'd' (length=1)
34 => string '2c.' (length=3)
35 => string 'b4' (length=2)
36 => string '1a4' (length=3)