Compare columns in CSV and override newlist - python-2.7

I'm new to Python. I have two .csv files with identical columns. In oldlist.csv, row[9] has been edited by hand to hold employee names; newlist.csv, when generated, defaults to certain domains for those names. I want to compare oldlist.csv against newlist.csv and override row[9] in newlist.csv with the data in row[9] from oldlist.csv. Thanks for your help.
Example:
(oldlist)      (newlist)
col1, col2     col1, col2
1234, Bob      1234, Jane
I want to read oldlist and, if col1 equals col1 in newlist, override col2. I want to continue to write.write(row) for every row that matches on col1 from oldlist.
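A minimal sketch of the lookup-and-override described above, not a definitive implementation. It assumes col1 is a unique key, uses the two-column example rather than the real row[9] layout, and is written for Python 3 (in Python 2.7 the csv module expects files opened in binary mode). The names override_column, read_rows, KEY_COL and NAME_COL are hypothetical.

```python
import csv
import io

KEY_COL = 0   # column used to match rows (col1 in the example)
NAME_COL = 1  # column to carry over (the question uses row[9])


def override_column(old_rows, new_rows, key_col=KEY_COL, name_col=NAME_COL):
    """Copy name_col from old_rows into new_rows wherever key_col matches."""
    # Build a lookup from key to the edited value in the old list.
    old_names = {row[key_col]: row[name_col] for row in old_rows}
    merged = []
    for row in new_rows:
        row = list(row)
        if row[key_col] in old_names:
            row[name_col] = old_names[row[key_col]]
        merged.append(row)
    return merged


def read_rows(text):
    """Parse CSV text; skipinitialspace drops the blank after each comma."""
    return list(csv.reader(io.StringIO(text), skipinitialspace=True))


# In-memory stand-ins for oldlist.csv and newlist.csv (sample data is assumed).
old_rows = read_rows("1234, Bob\n5678, Ann\n")
new_rows = read_rows("1234, Jane\n9999, Default\n")
merged = override_column(old_rows, new_rows)
print(merged)
```

With real files, the same function can be fed from csv.reader(open(...)) and the result written back with csv.writer(...).writerows(merged). Rows in newlist with no match in oldlist are left unchanged.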

Related

Count and filter on the basis of third column in informatica

I have a question about data like this:
Col1 Col2 Col3
45321_320 A Y
45321_320 A N
76453-10 A Y
45638_80 A Y
We need to count the number of rows that share the same Col1; for example, the first two rows should be counted as count=2 and the rest as count=1. Then, for groups with count=2 or more, the records need to be filtered out on the basis of Col3=Y. How can we do that in Informatica?
https://i.stack.imgur.com/JkxnG.png
This is a little tricky. Please follow the steps below.
Sort the data based on col1.
Use an aggregator (agg) grouped on col1. Create a new column called count_col1 that holds the row count.
Create another column, cnt_col3_y = count(*, col3='Y')
Join the agg output with the sorter output based on col1.
Add a filter. The logic should be
iif( count_col1>1 and cnt_col3_y>0, false, true)
Link the output of the filter to the target.
This will generate output like below.
Col1 Col2 Col3
76453-10 A Y
45638_80 A Y
If you want different output let me know.
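The same count-then-filter logic can be sketched outside Informatica to check the expected result. This is only an illustration in Python using the sample rows from the question; count_col1 and cnt_col3_y mirror the column names in the steps above.

```python
from collections import Counter

rows = [
    ("45321_320", "A", "Y"),
    ("45321_320", "A", "N"),
    ("76453-10", "A", "Y"),
    ("45638_80", "A", "Y"),
]

# Aggregator step: rows per col1, and rows per col1 where col3 is 'Y'.
count_col1 = Counter(r[0] for r in rows)
cnt_col3_y = Counter(r[0] for r in rows if r[2] == "Y")

# Filter step: drop every row whose col1 group has more than one row
# and at least one 'Y'; keep everything else.
kept = [
    r for r in rows
    if not (count_col1[r[0]] > 1 and cnt_col3_y[r[0]] > 0)
]

for r in kept:
    print(*r)
```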

How to collapse sheet by pivoting rows into csv data

I have a set of data where an account id can have multiple rows of country. I'm looking for an array formula that will give me a unique list of accounts with the countries in the second column as csv values, e.g. country1,country2,country3.
If I take the unique accounts, the formula below does it per row, but I'm really looking for an array formula so I don't have to maintain it as the number of rows grows.
=TEXTJOIN(",",1,UNIQUE(QUERY(A:B,"select B where A = '"&D2&"'",0)))
I have a sample sheet here.
try:
=INDEX(REGEXREPLACE(TRIM(SPLIT(FLATTEN(QUERY(QUERY(
IF(A2:A="",,{A2:A&"×", B2:B&","}),
"select max(Col2)
where not Col2 matches '^×|^$'
group by Col2
pivot Col1"),,9^9)), "×")), ",$", ))
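For comparison, the intended result (one row per account, with its distinct countries joined by commas) can be sketched in Python. The sample rows and the collapse function are hypothetical, and this groups the data directly rather than reproducing the pivot used in the formula.

```python
def collapse(rows):
    """Group (account, country) pairs into (account, 'c1,c2,...') rows."""
    grouped = {}
    for account, country in rows:
        if not account:
            continue  # skip blank rows, as the formula does for empty A cells
        countries = grouped.setdefault(account, [])
        if country not in countries:  # keep each country once per account
            countries.append(country)
    return [(account, ",".join(c)) for account, c in grouped.items()]


# Hypothetical sample data standing in for columns A:B.
sample = [
    ("acc1", "France"),
    ("acc1", "Spain"),
    ("acc2", "Italy"),
    ("", ""),
    ("acc1", "France"),
]
result = collapse(sample)
print(result)
```

Accounts come out in first-seen order here, which relies on dictionaries preserving insertion order (Python 3.7 and later).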

How to update multiple columns in the same UPDATE statement when one column depends on another column's new value in Redshift

I want to update multiple columns in the same UPDATE statement, where one column depends on the new value of another column.
Example:
Sample data: col1 and col2 are the column names and test_update is the table name.
SELECT * FROM test_update;
col1 col2
col-1 col-2
col-1 col-2
col-1 col-2
update test_update set col1 = 'new', col2=col1||'-new';
SELECT * FROM test_update;
col1 col2
new col-1-new
new col-1-new
new col-1-new
What I need is for col2 to be updated to new-new, since the updated value of col1 is new.
I think this may not be possible in one SQL statement. If it is possible, how can we do it? If it is not, what is the best way of handling this in a data warehouse environment: executing multiple updates, first on col1 and then on col2, or something else?
I hope my question is clear.
You cannot update the second column based on the result of updating the first column. However, this can be achieved in a single statement by "pre-calculating" the result you want and then updating based on that.
The following update using a join is based on the example provided in the Redshift documentation:
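-- Assumes test_update has a unique key column named id
-- (not shown in the sample data above).
UPDATE test_update
SET col1 = precalc.col1
, col2 = precalc.col2
FROM (
SELECT id
, 'new' AS col1
-- build col2 from the new value, not from the old col1
, 'new' || '-new' AS col2
FROM test_update
) precalc
WHERE test_update.id = precalc.id;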
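The behaviour can be tried locally with a small sketch. This uses Python's sqlite3 module rather than Redshift, so it only illustrates the general SQL rule that every right-hand side in SET sees the row's old values; it is an assumption that this is close enough to Redshift for illustration. Here the pre-calculation is done on the Python side by binding the new value twice, instead of through a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_update (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO test_update VALUES (?, ?)",
    [("col-1", "col-2")] * 3,
)

# Naive update: col2 is built from the OLD col1, not from 'new'.
conn.execute("UPDATE test_update SET col1 = 'new', col2 = col1 || '-new'")
after_naive = conn.execute(
    "SELECT DISTINCT col1, col2 FROM test_update"
).fetchall()

# Reset the sample data.
conn.execute("UPDATE test_update SET col1 = 'col-1', col2 = 'col-2'")

# Pre-calculated update: compute the new value once and use it in both columns.
new_value = "new"
conn.execute(
    "UPDATE test_update SET col1 = ?, col2 = ? || '-new'",
    (new_value, new_value),
)
after_precalc = conn.execute(
    "SELECT DISTINCT col1, col2 FROM test_update"
).fetchall()

print(after_naive)
print(after_precalc)
```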

Google Sheets Queries and 'not in'

I am trying to piece together a query that does a few things. One, I want it to list out all the names in a column, and two, I only want it to list out the name from that column if it doesn't exist within an array of columns.
=QUERY(QUERY(Breakdown!$A$2:$B), "select Col1 where Col1 != '' and Col2 = 'Warrior' order by Col1 Asc")
I got as far as this, which displays all of the names in the column as I want it to, but when I start adding in 'not in' type parameters, I break it every which way. How do I check that Col1 doesn't exist in the range ='Raid Comp'!A2:Q10?
Here is the spreadsheet: https://docs.google.com/spreadsheets/d/1X0GiOCAAve1CR4A3JG2Ybf-daMvrrhAsZF5V3XEdn4E/edit?usp=sharing
What I am trying to do is this: once a name is entered within the colored areas, if that name exists in the list below the colored area, it is removed from the list.
Example:
try regex in query:
=QUERY({Breakdown!$A$2:$B},
"select Col1
where Col2 = 'Warrior'
and not Col1 matches '"&TEXTJOIN("|", 1, 'Raid Comp'!A2:Q10)&"'
order by Col1 asc")
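The "not in" filter can be sketched in Python to make the intended logic explicit. This is an illustration only: the available function and the sample names are hypothetical, and it uses exact set membership where the formula uses a regex alternation built with TEXTJOIN.

```python
def available(breakdown, raid_comp, role="Warrior"):
    """Names with the given role that do not appear anywhere in raid_comp."""
    # Flatten the raid grid into a set of names, ignoring empty cells.
    taken = {name for row in raid_comp for name in row if name}
    return sorted(
        name
        for name, name_role in breakdown
        if name and name_role == role and name not in taken
    )


# Hypothetical stand-ins for Breakdown!A2:B and 'Raid Comp'!A2:Q10.
breakdown = [
    ("Zed", "Warrior"),
    ("Abe", "Warrior"),
    ("Cat", "Mage"),
    ("Bob", "Warrior"),
]
raid_comp = [["Bob", ""], ["", "Cat"]]

remaining = available(breakdown, raid_comp)
print(remaining)
```

One design note: because the formula joins the names into a regex, a name containing regex metacharacters (such as a dot or parenthesis) could match more or less than intended, which the set-based check avoids.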

IF + AND / OR logic inside of a query

Below is an example document I have shared:
https://docs.google.com/spreadsheets/d/1WuQIqn8DA12R0mNFGMdjJahQ0eNoxKODpSwopk7KoYU/edit#gid=0
My data is a simple table:
I want to do the following:
Starting at cell K7 on the patient tab, I want to query the call log tab for two main conditions.
Query select logic: return columns D, E, F, A when certain conditions are met:
if the text in col C equals the text in patient tab cell C7 AND col D says "No beds Available" AND col I shows time left to calling greater than 0
OR, if not, then:
if col B = cell H3 in the patient tab and col C = cell C7 in the patient tab
Thank you for your help
My example could help you.
Suppose you have a small data set like this, in columns A:D:
Then you can use a query statement with two or more OR conditions, as long as each condition group is wrapped in parentheses. Sample formula:
=QUERY({A:D},"select Col1, Col2, Col3, Col4 where (Col1 < 7 and Col3 = 'c') or (Col2 = 'a' and Col4 > 0)")
To use the Col1, Col2, Col3... notation inside the query, the data must be wrapped in {}.
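The grouping of AND and OR in the sample formula can be checked with a short Python sketch. The rows are hypothetical sample data, and the matches function mirrors the where clause of the formula: each parenthesized group is an AND, and the groups are combined with OR.

```python
# Hypothetical rows standing in for columns A:D.
rows = [
    (1, "a", "c", 5),
    (9, "a", "x", 0),
    (8, "b", "c", 3),
    (3, "b", "c", 0),
]


def matches(row):
    col1, col2, col3, col4 = row
    # (Col1 < 7 and Col3 = 'c') or (Col2 = 'a' and Col4 > 0)
    return (col1 < 7 and col3 == "c") or (col2 == "a" and col4 > 0)


selected = [row for row in rows if matches(row)]
print(selected)
```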