string split in LINQ - regex

This is what I tried using Split:
string[] req_info_texts = Regex.Matches(model_file_string_qts_corrected,
"RequirementInfo.*\"")
.OfType<Match>()
.Select(m=> m.Groups[0].Value.Split('\'').ToString())
.ToArray();
The lines in the string model_file_string_qts_corrected that match RequirementInfo.*\" look like this:
RequirementInfo "{'1' 2' 3'4 '5' 6'7' 8'syed_syed' 'SRDD_PFC_047602' } %GIDa_033022bd_8058_4216_8b9d_71454ba5f896"
There are n such lines in the string.
I need syed_syed in the array req_info_texts.
But what I get is an index out of range exception.
Can you tell me what the mistake is?

string[] req_info_texts = Regex.Matches(input, "RequirementInfo.*\"")
    .Cast<Match>()
    .Select(m => m.Value
        .Split('\'')
        .Where(x => x.Contains("syed_syed"))
        .Single()
    ).ToArray();

Given your input string is
RequirementInfo "{'other' ' ' '' 'true' 'syed_syed_GRP001' 'klajdskfjadklsjfklsa' } %GIDa_ed66dae7_2d68_4d07_9c67_a1cf1cb614cc" RequirementInfo "{'other' ' ' '' 'true' 'syed_syed_GRP001' 'klajdskfjadklsjfklsa' } %GIDa_b9a766f9_2b2b_4ca8_98f4_f693055b4792" RequirementInfo "{'other' ' ' '' 'true' 'syed_syed_GRP004' 'klajdskfjadklsjfklsa' } %GIDa_271d5326_cb57_4d87_8cd9_66687c0a1d32" RequirementInfo "{'other' ' ' '' 'true' 'syed_syed_GRP03' 'klajdskfjadklsjfklsa' } %GIDa_07ed6119_91d2_41f9_94dc_69d518503d64"
just with newlines, as you said in a comment on another question, you just need two splits:
var infosString = "RequirementInfo \"{'other' ' ' '' 'true' 'syed_syed_GRP001' 'klajdskfjadklsjfklsa' } %GIDa_ed66dae7_2d68_4d07_9c67_a1cf1cb614cc\"\nRequirementInfo \"{'other' ' ' '' 'true' 'syed_syed_GRP001' 'klajdskfjadklsjfklsa' } %GIDa_b9a766f9_2b2b_4ca8_98f4_f693055b4792\"\n RequirementInfo \"{'other' ' ' '' 'true' 'syed_syed_GRP004' 'klajdskfjadklsjfklsa' } %GIDa_271d5326_cb57_4d87_8cd9_66687c0a1d32\"\n RequirementInfo \"{'other' ' ' '' 'true' 'syed_syed_GRP03' 'klajdskfjadklsjfklsa' } %GIDa_07ed6119_91d2_41f9_94dc_69d518503d64";
var result = infosString.Split('\n').Select(line => line.Split('\'')[9]).ToArray();
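result is now an array containing syed_syed_GRP001, syed_syed_GRP001, syed_syed_GRP004 and syed_syed_GRP03.
The first Split creates an array with the strings starting with RequirementInfo, and the Select splits each of these strings again on the single quote and takes the 10th item (index 9), which is the one starting with syed_syed.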
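For comparison, here is a minimal sketch of the same extraction in Python, used only as an illustration language. The function name extract_names and the shortened sample data are hypothetical. Rather than relying on the fixed position 9, it collects every single-quoted token that starts with a given prefix, which assumes the wanted value always begins with syed_syed:

```python
import re

def extract_names(text, prefix="syed_syed"):
    """Return every single-quoted token in text that starts with prefix."""
    pattern = r"'(" + re.escape(prefix) + r"[^']*)'"
    return re.findall(pattern, text)

# Shortened, made-up sample in the same shape as the input above.
sample = (
    "RequirementInfo \"{'other' ' ' '' 'true' 'syed_syed_GRP001' 'abc' } %GIDa_1\"\n"
    "RequirementInfo \"{'other' ' ' '' 'true' 'syed_syed_GRP004' 'abc' } %GIDa_2\"\n"
)

names = extract_names(sample)

# The positional variant from the answer, for lines with exactly this layout.
by_position = [line.split("'")[9] for line in sample.splitlines()]
```

The prefix match keeps working if the number of quoted fields before the name changes, while the positional split breaks as soon as a field is added or removed.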

Related

Redshift Copy from s3 using for loop

I have many files in S3 to load,
and I have created a manifest file at each prefix.
For instance, at s3://my-bucket/unit_1
I have the files below.
chunk1.csv.gz
chunk2.csv.gz
chunk3.csv.gz
chunk4.csv.gz
unit.manifest
So with a COPY command, I can load the unit_1 files into Redshift.
However, I have more than 1000 units, so I want to do it in a loop.
I want a loop that iterates from 1 to 1000 and changes only the prefix of the manifest file.
So I wrote the procedure below:
create or replace procedure copy_loop()
language plpgsql
as $$
BEGIN
FOR i in 1..1000 LOOP
COPY mytable
FROM 's3://my-bucket/unit_%/unit.manifest', i
credentials 'aws_iam_role=arn:aws:iam::myrolearn'
MANIFEST
REGION 'ap-northeast-2'
REMOVEQUOTES
IGNOREHEADER 1
ESCAPE
DATEFORMAT 'auto'
TIMEFORMAT 'auto'
GZIP
DELIMITER '|'
ACCEPTINVCHARS '?'
COMPUPDATE FALSE
STATUPDATE FALSE
MAXERROR 0
BLANKSASNULL
EMPTYASNULL
NULL AS '\N'
EXPLICIT_IDS;
END LOOP;
END;
$$;
But I got this message
SQL Error [500310] [42601]: Amazon Invalid operation: syntax error at or near ",";
How can I handle this?
This is my solution.
create or replace procedure copy_loop(i1 int, i2 int)
language plpgsql
as $$
DECLARE
prefix TEXT := 's3://mybucket/unit_';
manifest TEXT := '/unit.manifest' ;
manifest_location TEXT ;
copy_commands VARCHAR(2000) ;
copy_options VARCHAR(2000) := 'credentials '|| quote_literal('aws_iam_role=myrolearn')
|| ' MANIFEST '
|| ' REGION ' || quote_literal('ap-northeast-2')
|| ' REMOVEQUOTES '
|| ' IGNOREHEADER 1 '
|| ' ESCAPE '
|| ' DATEFORMAT ' || quote_literal('auto')
|| ' TIMEFORMAT ' || quote_literal('auto')
|| ' GZIP '
|| ' DELIMITER ' || quote_literal('|')
|| ' ACCEPTINVCHARS ' || quote_literal('?')
|| ' COMPUPDATE FALSE '
|| ' STATUPDATE FALSE '
|| ' MAXERROR 0 '
|| ' BLANKSASNULL '
|| ' EMPTYASNULL '
|| ' NULL AS ' || quote_literal('\N')
|| ' EXPLICIT_IDS ';
BEGIN
    FOR i in i1..i2 LOOP
        manifest_location := prefix || i || manifest;
        -- the spaces around the quoted location keep the keywords from running together
        copy_commands := 'COPY mytable FROM ' || quote_literal(manifest_location) || ' ' || copy_options;
        execute copy_commands;
    END LOOP;
END;
$$;
Using this procedure, I could copy the files from more than 1000 units.
Being able to set the start and end numbers of the loop also helped to divide up the loading jobs. Since loading a large amount of data takes a few hours, I think it is better to run the load in chunks.
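The idea behind the procedure (build each COPY statement as a string, then run the loop over a sub-range) can be sketched in Python as follows. This only assembles strings and never talks to Redshift. The function names, the default bucket path, and the shortened option list are assumptions made for illustration:

```python
def build_copy_statements(first, last, table="mytable",
                          prefix="s3://my-bucket/unit_",
                          manifest="/unit.manifest",
                          options="MANIFEST GZIP"):
    """Build one COPY statement per unit number in first..last (inclusive)."""
    statements = []
    for i in range(first, last + 1):
        location = f"{prefix}{i}{manifest}"
        # Note the spaces around the quoted location.
        statements.append(f"COPY {table} FROM '{location}' {options};")
    return statements

def chunked_ranges(first, last, size):
    """Split first..last into consecutive (start, end) ranges of at most size units."""
    return [(start, min(start + size - 1, last))
            for start in range(first, last + 1, size)]

statements = build_copy_statements(1, 2)
ranges = chunked_ranges(1, 1000, 400)
```

Each (start, end) pair from chunked_ranges corresponds to one call such as copy_loop(start, end), so a long load can be split into several shorter jobs.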

How to read data from text file using python2.7?

Can anybody help me read numbers from a file in Python and put each number into an array?
I have written the following code. It does the job, but it reads 10 as two numbers:
with open("test.dat") as infile:
    for i, line in enumerate(infile):
        if i == 0:
            for x in range(0, len(line)):
                if(line[x] == ' ' or line[x] == " "):
                    continue
                else:
                    print(x, " " , line[x], ", ")
                    initial_state.append(line[x])
---Results:
(0, ' ', '1', ', ')
(2, ' ', '2', ', ')
(4, ' ', '3', ', ')
(6, ' ', '4', ', ')
(8, ' ', '5', ', ')
(10, ' ', '6', ', ')
(12, ' ', '7', ', ')
(14, ' ', '8', ', ')
(16, ' ', '9', ', ')
(18, ' ', '1', ', ')
(19, ' ', '0', ', ')
(21, ' ', '1', ', ')
(22, ' ', '1', ', ')
(24, ' ', '1', ', ')
(25, ' ', '2', ', ')
(27, ' ', '1', ', ')
(28, ' ', '3', ', ')
(30, ' ', '1', ', ')
(31, ' ', '4', ', ')
(33, ' ', '1', ', ')
(34, ' ', '5', ', ')
(36, ' ', '0', ', ')
(37, ' ', '\n', ', ')
The indexes include the spaces. This is the line of numbers I'm trying to add to the array:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0
Use .split() to break the line into fields and loop over them. The following code should do it:
initial_state_arr = []
with open("test.dat") as infile:
    for i, line in enumerate(infile):
        if i == 0:  # first line only
            # split() with no argument splits on any whitespace and drops the newline
            for field in line.split():
                initial_state_arr.append(field)
If you are sure each number is separated by a single space why not just split the line and print each element as an array:
with open("test.dat") as infile:
    content = infile.read().split()
    for index, number in enumerate(content):
        print ((index*2, number))
And what is your exact input and expected result? Does the file have multiple spaces between numbers?
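As a minimal sketch of the split-based approach, assuming the numbers are integers and only the first line of the file matters, the parsing can be separated from the file handling. The helper names parse_numbers and read_first_line_numbers are hypothetical:

```python
def parse_numbers(line):
    """Split a line on any run of whitespace and convert each field to int."""
    return [int(field) for field in line.split()]

def read_first_line_numbers(path):
    """Read only the first line of the file at path and parse it."""
    with open(path) as infile:
        return parse_numbers(infile.readline())

# The line from the question, including its trailing newline.
first_line = "1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0\n"
initial_state = parse_numbers(first_line)
```

Because split() works on whole fields rather than single characters, 10 stays one number, and repeated spaces or tabs between fields do no harm.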

Python remove empty element by list

I have this list that contains empty elements:
list = ['Caramanico Terme', ' ', 'Castellafiume', ' ', 'Castelvecchio Subequo', ' ', 'Falesia di ovindoli', ' ', 'Fara San Martino', ' ', "L'Aquila - Madonna d'Appari", ' ', 'La Palma Pazza (Bisegna AQ)', ' ', 'Liscia Palazzo', ' ', 'Luco dei marsi', ' ', 'Montebello di Bertona', ' ', 'Monticchio', ' ', 'Palena', ' ', 'Pennadomo', ' ', 'Pennapiedimonte', ' ', 'Pescomarrino', ' ', 'Petrella', ' ', 'Pianezza', ' ', 'Pietrasecca', ' ', ' ', 'PietrePiane', ' ', 'Pizzi di Lettopalena (loc. Fonte della Noce)', ' ', 'Placche di Bini', ' ', 'Roccamorice', ' ', 'Sasso di Lucoli', ' ', 'Villetta Barrea', ' ']
How can I remove these ' ' empty elements?
I have tried it this way:
[x for x in list if all(x)]
but the elements are not deleted.
Any help?
Thanks
First of all, make sure not to call your list list. That's a built-in type, and shadowing it will cause problems later. I renamed it to lst. Then you can filter the list the following way:
lst = ['Caramanico Terme', ' ', 'Castellafiume', ' ', 'Castelvecchio Subequo', ' ', 'Falesia di ovindoli', ' ', 'Fara San Martino', ' ', "L'Aquila - Madonna d'Appari", ' ', 'La Palma Pazza (Bisegna AQ)', ' ', 'Liscia Palazzo', ' ', 'Luco dei marsi', ' ', 'Montebello di Bertona', ' ', 'Monticchio', ' ', 'Palena', ' ', 'Pennadomo', ' ', 'Pennapiedimonte', ' ', 'Pescomarrino', ' ', 'Petrella', ' ', 'Pianezza', ' ', 'Pietrasecca', ' ', ' ', 'PietrePiane', ' ', 'Pizzi di Lettopalena (loc. Fonte della Noce)', ' ', 'Placche di Bini', ' ', 'Roccamorice', ' ', 'Sasso di Lucoli', ' ', 'Villetta Barrea', ' ']
filtered = [x for x in lst if len(x.strip()) > 0]
This will remove every element that is empty or contains only whitespace, such as '' or ' '.
EDIT:
As corn3lius pointed out, this would work too:
filtered = [x for x in lst if x.strip()]
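As a side note on why the original attempt with all(x) keeps everything: all() applied to a string checks each character, and every one-character string, including a space, is truthy. The empty string passes too, because all() of an empty iterable is True. A small sketch with made-up sample data:

```python
items = ['Palena', ' ', '', 'Petrella', '   ']

# all(x) iterates over the characters of x; ' ' is a non-empty string, so it is truthy.
kept_by_all = [x for x in items if all(x)]

# x.strip() is '' (falsy) for empty or whitespace-only strings.
kept_by_strip = [x for x in items if x.strip()]
```

Here kept_by_all is identical to items, while kept_by_strip contains only the two place names.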
You can add a condition to the list comprehension:
l = ['Caramanico Terme', ' ', 'Castellafiume', ' ', 'Castelvecchio Subequo', ' ', 'Falesia di ovindoli', ' ', 'Fara San Martino', ' ', "L'Aquila - Madonna d'Appari", ' ', 'La Palma Pazza (Bisegna AQ)', ' ', 'Liscia Palazzo', ' ', 'Luco dei marsi', ' ', 'Montebello di Bertona', ' ', 'Monticchio', ' ', 'Palena', ' ', 'Pennadomo', ' ', 'Pennapiedimonte', ' ', 'Pescomarrino', ' ', 'Petrella', ' ', 'Pianezza', ' ', 'Pietrasecca', ' ', ' ', 'PietrePiane', ' ', 'Pizzi di Lettopalena (loc. Fonte della Noce)', ' ', 'Placche di Bini', ' ', 'Roccamorice', ' ', 'Sasso di Lucoli', ' ', 'Villetta Barrea', ' ']
print([x for x in l if x != ' '])
Removing every item that is '' (the empty string) is the same as building a new list from the elements of the first list that have length > 0. This one-liner takes care of that:
a = ['', 'apple', '', 'peach']
b = [i for i in a if i != '']
To remove empty items from a list, where an "empty" item may be a quoted string holding one or more spaces, use strip() in the list comprehension.
Example:
temp_str = ' || 0X0C || 0X00000 || 0X00094 || 0X00E8C || IN_OPER || 000000e8cff7e000 || '
temp_str.split('||')
# result: [' ', ' 0X0C ', ' 0X00000 ', ' 0X00094 ', ' 0X00E8C ', ' IN_OPER ', ' 000000e8cff7e000 ', ' ']
temp_list = [ x for x in temp_str.split('||') if x]
temp_list
# result: [' ', ' 0X0C ', ' 0X00000 ', ' 0X00094 ', ' 0X00E8C ', ' IN_OPER ', ' 000000e8cff7e000 ', ' ']
temp_list = [ x for x in temp_str.split('||') if x.strip()]
temp_list
# result: [' 0X0C ', ' 0X00000 ', ' 0X00094 ', ' 0X00E8C ', ' IN_OPER ', ' 000000e8cff7e000 ']
temp_list = [ x.strip() for x in temp_str.split('||') if x.strip()]
temp_list
# result: ['0X0C', '0X00000', '0X00094', '0X00E8C', 'IN_OPER', '000000e8cff7e000']

Reading a maze file in python and printing out the maze line by line

I am creating a maze traverser in Python. Initially I read the maze txt file as a list, but I am unable to print the maze line by line. We are given the number of rows and columns, the row and column of the entrance, and row and column of the exit.
What my output is:
[['5', ' ', '5', ' ', '4', ' ', '1', ' ', '0', ' ', '1'], ['#', ' ', '#', '#', '#'], ['#', ' ', '#', ' ', '#'], ['#', ' ', '#', ' ', '#'], ['#', ' ', ' ', ' ', '#'], ['#', ' ', '#', '#', '#']]
what I am looking for:
5 5 4 1 0 1
# ###
# # #
# # #
#   #
# ###
My test code to print out the maze:
#read MAZE and print
def readMaze(maze, filename):
    mazeFile = open(filename, "r")
    columns = mazeFile.readlines()
    for column in columns:
        column = column.strip()
        row = [i for i in column]
        maze.append(row)
maze =[]
readMaze(maze, "maze01.txt")
print maze
If your maze list is like this:
maze = [['5', ' ', '5', ' ', '4', ' ', '1', ' ', '0', ' ', '1'], ['#', ' ', '#', '#', '#'], ['#', ' ', '#', ' ', '#'], ['#', ' ', '#', ' ', '#'], ['#', ' ', ' ', ' ', '#'], ['#', ' ', '#', '#', '#']]
You could print it and have your desired print output using join and a for loop like this example:
for i in maze:
    print("".join(i))
Output:
5 5 4 1 0 1
# ###
# # #
# # #
#   #
# ###
You're simply printing the whole list without ever iterating over it to print the characters the way you want them. You need a for loop, just like the one in your readMaze function, to iterate over the top-level list. For each element (which is a list of characters), use join to concatenate the characters into one string, print it, then move on to the next line.
# your input list has multiple nested sub-lists
l = [
    ['5', ' ', '5', ' ', '4', ' ', '1', ' ', '0', ' ', '1'],
    ['#', ' ', '#', '#', '#'],
    ['#', ' ', '#', ' ', '#'],
    ['#', ' ', '#', ' ', '#'],
    ['#', ' ', ' ', ' ', '#'],
    ['#', ' ', '#', '#', '#']
]
# so we iterate over them...
for sublist in l:
    print(''.join(sublist)) # ...and concatenate them together before printing
Output:
5 5 4 1 0 1
# ###
# # #
# # #
#   #
# ###
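Going one step further, a possible sketch that separates the header line from the grid and renders the grid with join. It assumes, following the question's description, that the header holds rows, columns, entrance row, entrance column, exit row and exit column in that order. The function names and the inline maze_text are hypothetical stand-ins for reading maze01.txt:

```python
def parse_maze(text):
    """Split maze text into the integer header and a grid of characters."""
    lines = text.splitlines()
    header = [int(value) for value in lines[0].split()]
    grid = [list(line) for line in lines[1:]]
    return header, grid

def render(grid):
    """Join each row of characters into a line, and the lines into one string."""
    return "\n".join("".join(row) for row in grid)

# Inline stand-in for the file contents.
maze_text = "5 5 4 1 0 1\n# ###\n# # #\n# # #\n#   #\n# ###\n"

header, grid = parse_maze(maze_text)
rows, cols, entry_row, entry_col, exit_row, exit_col = header
print(render(grid))
```

Using splitlines() instead of strip() on each line keeps any leading spaces a maze row might have, which matters once rows do not all start with a wall.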

Convert CamelCase string to uppercase with underscore

I have a string "CamelCase" and I use this regex:
string pattern = "(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])";
string[] substrings = Regex.Split("CamelCase", pattern);
In substrings, I get Camel and Case, which is fine, but I'd like them all in uppercase, like CAMEL and CASE. Better still, I'd like to get a single string like CAMEL_CASE, but please, all with Regex.
Here is a JavaScript implementation.
function camelCaseToUpperCase(str) {
  // the g flag replaces every lower/upper boundary, not just the first one
  return str.replace(/([a-z])([A-Z])/g, '$1_$2').toUpperCase();
}
Demo
printList([ 'CamelCase', 'camelCase' ],
function(value, idx, values) {
return value + ' -> '
+ camelCaseToUpperCase(value) + ' -> '
+ camelToTitle(value, '_');
}
);
// Case Conversion Functions
function camelCaseToUpperCase(str) {
  // the g flag replaces every lower/upper boundary, not just the first one
  return str.replace(/([a-z])([A-Z])/g, '$1_$2').toUpperCase();
}
function camelToTitle(str, delimiter) {
return str.replace(/([A-Z][a-z]+)/g, ' $1') // Words beginning with UC
.replace(/([A-Z][A-Z]+)/g, ' $1') // "Words" of only UC
.replace(/([^A-Za-z ]+)/g, ' $1') // "Words" of non-letters
.trim() // Remove any leading/trailing spaces
.replace(/[ ]/g, delimiter || ' '); // Replace all spaces with the delim
}
// Utility Functions
function printList(items, conversionFn) {
var str = '<ul>';
[].forEach.call(items, function(item, index) {
str += '<li>' + conversionFn(item, index, items) + '</li>';
});
print(str + '</ul>');
}
function print() {
write.apply(undefined, arguments);
}
function println() {
write.apply(undefined, [].splice.call(arguments,0).concat('<br />'));
}
function write() {
document.getElementById('output').innerHTML += arguments.length > 1 ?
[].join.call(arguments, ' ') : arguments[0]
}
#output {
font-family: monospace;
}
<h1>Case Conversion Demo</h1>
<div id="output"></div>
In Perl you can do this:
$string = "CamelCase";
$string =~ s/((?<=[a-z])[A-Z][a-z]+)/_\U$1/g;
$string =~ s/(\b[A-Z][a-z]+)/\U$1/g;
print "$string\n";
The replacement uses \U to convert the found group to uppercase.
That can be compressed into a single regex using Perl's e option to evaluate a replacement:
$string = "CamelCase";
$string =~ s/(?:\b|(?<=([a-z])))([A-Z][a-z]+)/(defined($1) ? "_" : "") . uc($2)/eg;
print "$string\n";
Using sed and tr unix utilities (from your terminal)...
echo "CamelCase" | sed -e 's/\([A-Z]\)/-\1/g' -e 's/^-//' | tr '-' '_' | tr '[:lower:]' '[:upper:]'
If you have camel case strings with "ID" at the end and you'd like to keep it that way, then use this one...
echo "CamelCaseID" | sed -e 's/\([A-Z]\)/-\1/g' -e 's/^-//' | tr '-' '_' | tr '[:lower:]' '[:upper:]' | sed -e 's/I_D$/ID/g'
By extending the String class in ruby...
class String
  def camelcase_to_underscore
    self.gsub(/::/, '/').
      gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
      gsub(/([a-z\d])([A-Z])/,'\1_\2').
      tr("-", "_").
      upcase
  end
end
Now, you can execute the camelcase_to_underscore method on any string. Example:
>> "CamelCase".camelcase_to_underscore
=> "CAMEL_CASE"
using System;
using System.Text.RegularExpressions;
public class Example
{
    public static void Main()
    {
        string input = "CamelCase";
        string output = Regex.Replace(input,
            @"(?:\b|(?<=([A-Za-z])))([A-Z][a-z]*)",
            m => string.Format(@"{0}{1}",
                (m.Groups[1].Value.Length > 0)? "_" : "", m.Groups[2].Value.ToUpper()));
        Console.WriteLine(output);
    }
}
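A rough Python equivalent, offered as a sketch rather than a definitive implementation. It reuses the two substitutions from the Ruby answer above (in the opposite order) and then upper-cases the result. The function name camel_to_upper_snake is hypothetical:

```python
import re

def camel_to_upper_snake(name):
    """Convert CamelCase or camelCase to UPPER_CASE_WITH_UNDERSCORES."""
    # Underscore between a lowercase letter or digit and the capital that follows it.
    s = re.sub(r"([a-z\d])([A-Z])", r"\1_\2", name)
    # Underscore between an acronym and a following capitalized word (HTTPServer).
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", s)
    return s.upper()

converted = [camel_to_upper_snake(n) for n in ("CamelCase", "camelCase", "HTTPServer")]
```

Unlike the sed variant, this keeps a trailing acronym such as ID together, so "CamelCaseID" becomes "CAMEL_CASE_ID" without a separate clean-up step.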