I'm using the Birt list element to display my data from left to right. (see this question as reference). Eg. List element with a Grid in details and the grid set to inline.
The issue I'm facing now is, that the different rows in the grid are not aligned left to right (probably due to some rows having empty values in some fields). How can I force BIRT to align properly?
EDIT:
This is especially also a problem with longer text that wraps to more than 1 line. The wrapping /multiple lines should be reflected by all list elements in that "row of the output".
Unfortunately, I don't see any chance to accomplish this easily in the generic case - that is, if the number of records is unknown in advance, so you'd need more than one line:
student1 student2 student3
student4 student5
Let's call those line "main lines". One main line can contain up to 3 records. The number 3 may be different in your case, but we can assume it is a constant, since (at least for PDF reports) the paper width is restricted.
A possible solution could work like this:
In your data set, add two columns for each row: MAIN_LINE_NUM and COLUMN_NUM, where the meaning is obvious. For example, this could be done with pure SQL using analytic functions (untested):
select ...,
trunc((row_number() over (order by ...whatever...) - 1) / 3) + 1 as MAIN_LINE_NUM,
mod(row_number() over (order by ...whatever...) - 1), 3) +1 as COLUMN_NUM
from ...
order by ...whatever... -- The order must be the same as above.
Now you know where each record should go.
The next task is to transform the result set into a form where each record looks like this (for the example, think that you have 3 properties STUDENT_ID, NAME, ADDRESS for each student):
MAIN_LINE
STUDENT_ID_1
NAME_1
ADDRESS_1
STUDENT_ID_2
NAME_2
ADDRESS_2
STUDENT_ID_3
NAME_3
ADDRESS_3
You get the picture...
The SQL trick to achieve this is one that one should know.
I'll show this for the STUDENT_ID_1, STUDENT_ID_2 and NAME_1 column as an example:
with my_data as
( ... the query shown above including MAIN_LINE_NUM and COLUMN_NUM ...
)
select MAIN_LINE_NUM,
max(case when COLUMN_NUM=1 then STUDENT_ID else null end) as STUDENT_ID_1,
max(case when COLUMN_NUM=2 then STUDENT_ID else null end) as STUDENT_ID_2,
...
max(case when COLUMN_NUM=1 then NAME else null end) as NAME_1,
...
from my_data
group by MAIN_LINE_NUM
order by MAIN_LINE_NUM
As you see, this is quite clumsy if you need a lot of different columns.
On the other hand, this makes the output a lot easier.
Create a table item for your dat set, with 3 columns (for 1, 2, 3). It's best to not drag the dataset into the layout. Instead, use the "Insert element" context menu.
You need a detail row for each of the columns (STUDENT_ID, NAME, ADDRESS). So, add two more details rows (the default is one detail row).
Add header labels manually, if you like, or remove the header row if you don't need it (which is what I assume).
Remove the footer row, as you probably don't need it.
Drag the columns to the corresponding position in your table.
The table item should look like this now (data items):
+--------------+--------------+-------------+
+ STUDENT_ID_1 | STUDENT_ID_2 | STUDENT_ID3 |
+--------------+--------------+-------------+
+ NAME_1 | NAME_2 | NAME_3 |
+--------------+--------------+-------------+
+ ADDRESS_1 | ADDRESS_2 | ADDRESS_3 |
+--------------+--------------+-------------+
That's it!
This is one of the few examples where BIRT sucks IMHO in comparison to other tools like e.g. Oracle Reports - excuse my Klatchian.
Related
in my Google sheet table I have the first list with summary of invoices which are then separated to 4 lists according to parameters (manually). I need to know about all invoices from the first list, on which category/list they are.
So for example - lists: Alphabet, abc, def, mno, xyz. In Alphabet is column "list".
How to write function which found invoice on another list according to ID (column B) from Alphabet and write name of the correct list to column "list". I tried to write this function using IF, match, etc. But I still don't have solution. Can you help me please? Sorry for my English :-)
So here is an example which you could adapt. In columns E:H on the first sheet (and I could hide these columns later, starting in row2 and dragging down as needed, I put the following formulas:
=IF(LEN(iferror(query(abc!$A$2:$A,"select A where A='" & $A2 &"'"),""))>0,"abc","")
=IF(LEN(iferror(query(def!$A$2:$A,"select A where A='" & $A2 &"'"),""))>0,"def","")
=IF(LEN(iferror(query(mno!$A$2:$A,"select A where A='" & $A2 &"'"),""))>0,"mno","")
=IF(LEN(iferror(query(xyz!$A$2:$A,"select A where A='" & $A2 &"'"),""))>0,"xyz","")
Probably I could have simplified a little by putting the sheet names in E1:H1, but you get the idea.
Each of these looks for the ID. If the query succeeds, it returns the name of the sheet. If it fails, it returns the empty string.
Now in column B where I actually want the results, I put this formula in B2 and drag to copy as needed.
=if(E2&F2&G2&H2="","nowhere",E2&F2&G2&H2)
It says put those strings together, and if there is nothing there say nowhere, otherwise say the list. If it appears on more than one, and that can really happen, you could use JOIN instead.
Goal: I have a bunch of keywords I'd like to categorise automatically based on topic parameters I set. Categories that match must be in the same column so the keyword data can be filtered.
e.g. If I have "Puppies" as a first topic, it shouldn't appear as a secondary or third topic otherwise the data cannot be filtered as needed.
Example Data: https://docs.google.com/spreadsheets/d/1TWYepApOtWDlwoTP8zkaflD7AoxD_LZ4PxssSpFlrWQ/edit?usp=sharing
Video: https://drive.google.com/file/d/11T5hhyestKRY4GpuwC7RF6tx-xQudNok/view?usp=sharing
Parameters Tab: I will add words in columns D-F that change based on the keyword data set and there will often be hundreds, if not thousands, of options for larger data sets.
Categories Tab: I'd like to have a formula or script that goes down the columns D-F in Parameters and fills in a corresponding value (in Categories! columns D-F respectively) based on partial match with column B or C (makes no difference to me if there's a delimiter like a space or not. Final data sheet should only have one of these columns though).
Things I've Tried:
I've tried a bunch of things. Nested IF formula with regexmatch works but seems clunky.
e.g. this formula in Categories! column D
=IF(REGEXMATCH($B2,LOWER(Parameters!$D$3)),Parameters!$D$3,IF(REGEXMATCH($B2,LOWER(Parameters!$D$4)),Parameters!$D$4,""))
I nested more statements changing out to the next cell in Parameters!D column (as in , manually adding $D$5, $D$6 etc) but this seems inefficient for a list thousands of words long. e.g. third topic will get very long once all dog breed types are added.
Any tips?
Functionality I haven't worked out:
if a string in Categories B or C contains more than one topic in the parameters I set out, is there a way I can have the first 2 to show instead of just the first one?
e.g. Cell A14 in Categories, how can I get a formula/automation to add both "Akita" & "German Shepherd" into the third topic? Concatenation with a CHAR(10) to add to new line is ideal format here. There will be other keywords that won't have both in there in which case these values will just show up individually.
Since this data set has a bunch of mixed breeds and all breeds are added as a third topic, it would be great to differentiate interest in mixes vs pure breeds without confusion.
Any ideas will be greatly appreciated! Also, I'm open to variations in layout and functionality of the spreadsheet in case you have a more creative solution. I just care about efficiently automating a tedious task!!
Try using custom function:
To create custom function:
1.Create or open a spreadsheet in Google Sheets.
2.Select the menu item Tools > Script editor.
3.Delete any code in the script editor and copy and paste the code below into the script editor.
4.At the top, click Save save.
To use custom function:
1.Click the cell where you want to use the function.
2.Type an equals sign (=) followed by the function name and any input value — for example, =DOUBLE(A1) — and press Enter.
3.The cell will momentarily display Loading..., then return the result.
Code:
function matchTopic(p, str) {
var params = p.flat(); //Convert 2d array into 1d
var buildRegex = params.map(i => '(' + i + ')').join('|'); //convert array into series of capturing groups. Example (Dog)|(Puppies)
var regex = new RegExp(buildRegex,"gi");
var results = str.match(regex);
if(results){
// The for loops below will convert the first character of each word to Uppercase
for(var i = 0 ; i < results.length ; i++){
var words = results[i].split(" ");
for (let j = 0; j < words.length; j++) {
words[j] = words[j][0].toUpperCase() + words[j].substr(1);
}
results[i] = words.join(" ");
}
return results.join(","); //return with comma separator
}else{
return ""; //return blank if result is null
}
}
Example Usage:
Parameters:
First Topic:
Second Topic:
Third Topic:
Reference:
Custom Functions
I've added a new sheet ("Erik Help") with separate formulas (highlighted in green currently) for each of your keyword columns. They are each essentially the same except for specific column references, so I'll include only the "First Topic" formula here:
=ArrayFormula({"First Topic";IF(A2:A="",,IFERROR(REGEXEXTRACT(LOWER(B2:B&C2:C),JOIN("|",LOWER(FILTER(Parameters!D3:D,Parameters!D3:D<>""))))) & IFERROR(CHAR(10)®EXEXTRACT(REGEXREPLACE(LOWER(B2:B&C2:C),IFERROR(REGEXEXTRACT(LOWER(B2:B&C2:C),JOIN("|",LOWER(FILTER(Parameters!D3:D,Parameters!D3:D<>""))))),""),JOIN("|",LOWER(FILTER(Parameters!D3:D,Parameters!D3:D<>""))))))})
This formula first creates the header (which can be changed within the formula itself as you like).
The opening IF condition leaves any row in the results column blank if the corresponding cell in Column A of that row is also blank.
JOIN is used to form a concatenated string of all keywords separated by the pipe symbol, which REGEXEXTRACT interprets as OR.
IFERROR(REGEXEXTRACT(LOWER(B2:B&C2:C),JOIN("|",LOWER(FILTER(Parameters!D3:D,Parameters!D3:D<>""))))) will attempt to extract any of the keywords from each concatenated string in Columns B and C. If none is found, IFERROR will return null.
Then a second-round attempt is made:
& IFERROR(CHAR(10)®EXEXTRACT(REGEXREPLACE(LOWER(B2:B&C2:C),IFERROR(REGEXEXTRACT(LOWER(B2:B&C2:C),JOIN("|",LOWER(FILTER(Parameters!D3:D,Parameters!D3:D<>""))))),""),JOIN("|",LOWER(FILTER(Parameters!D3:D,Parameters!D3:D<>"")))))
Only this time, REGEXREPLACE is used to replace the results of the first round with null, thus eliminating them from being found in round two. This will cause any second listing from the JOIN clause to be found, if one exists. Otherwise, IFERROR again returns null for round two.
CHAR(10) is the new-line character.
I've written each of the three formulas to return up to two results for each keyword column. If that is not your intention for "First Topic" and "Second Topic" (i.e., if you only wanted a maximum of one result for each of those columns), just select and delete the entire round-two portion of the formula shown above from the formula in each of those columns.
I'm trying to replace "x" in each column (excepts for the first 2 columns) with the column name in a table with an unknown number of columns but with at least 2 columns.
I found the code to change one column, but I want it to be dynamic:
#"Ersatt värde" = Table.ReplaceValue(Källa,"x", Table.ColumnNames(Källa){2},Replacer.ReplaceText,{Table.ColumnNames(Källa){2}})
Any ideas on how to solve it?
If I understand correctly, I think you can try either approach below:
#"Ersatt värde" =
let
columnsToTransform = List.Skip(Table.ColumnNames(Källa), 2),
accumulated = List.Accumulate(columnsToTransform, Källa, (tableState as table, columnName as text) =>
Table.ReplaceValue(tableState,"x", columnName, Replacer.ReplaceText, {columnName})
)
in accumulated
or:
#"Ersatt värde" =
let
columnsToTransform = List.Skip(Table.ColumnNames(Källa), 2),
transformations = List.Transform(columnsToTransform, (columnName) => {columnName, each
Replacer.ReplaceText(Text.From(_), "x", columnName)}),
transformed = Table.TransformColumns(Källa, transformations)
in transformed,
Both ways follow a similar approach:
Figure out which columns to do replacements in (i.e. all except the first 2 columns)
Loop over columns determined in previous step and actually do the replacement.
I've used Replacer.ReplaceText since that's what you'd used in your question, but I believe this will replace both partial matches and full matches.
If you only want full matches to be replaced, I think you can use Replacer.ReplaceValue instead.
I'm trying to create "Sale Rep" summaries by "Shop", where I can simply filter a column by the rep's name, them populate a total sales for each shop next to the relevant filter result.
I'm using this to filter all the Stores by Scott:
=(filter(D25:D47,A25:A47 = "Scott"))
Next, want to associate the Store/Account in F to populate with the corresponding value of E inside of G. So, G25 should populate the value of E25 ($724), G26 with E26 ($822), and F27 with E38 ($511.50)
I don't know how to write the formula correctly, but something like this is what I'm trying to do: =IF(F25=D25:D38),E25 I know that's not right, and it won't work in a fill down. But I'm basically trying to look for and copy over the correct value match of D and E inside of G. So, Misty Mountain Medicince in F27 will be matched to the value of E38 and populated in G27.
The filter is what's throwing me off, because it's not a simple fill down. And I don't know how to match filtered results from one column to a matched value in another.
Hope the screenshot helps. Screenshot of table:
Change Field Rep: Scott to Scott and you might apply:
=query(A25:E38,"select D,E where A='"&F24&"'")
// Enter the following into G25 and copy down column G
=(filter(E25:E47, D25:D47 = F25))
or
// Enter the following into G25 will expand with content in F upto row 47
=ArrayFormula(IF(F25:F47 <> 0, VLOOKUP(F25:F47, D25:E47, 2, FALSE),))
I am working on scraping a table that has major and minor column names. When I do this, the table comes in having read both the column names and column groups, so the column names are misaligned in the dataframe like so (simplified):
unnamed1 unnamed2 unnamed3 Year Passing Rushing Receiving
2015 NA 200 60 NA NA NA
2014 NA 180 70 NA NA NA
My challenge is in shifting the column names so that 'Year' aligns over '2015' and so forth. The problem is then that the number of columns to shift does not remain constant from table to table (this is only one of many). My code at the moment looks like the following:
table1=read_html('http://www.pro-football-reference.com/players/T/TyexWi00.htm')
df=table1[0]
to_shift=len(df.dropna(how='all', axis=1).columns) #Number of empty columns to shift by
df2=df.dropna(how='all',axis=1) #Drop the empty columns
df2.columns=df.columns[-to_shift:] #Shift all columns left by the number i've found
The problem is that for a player that has none of one stat (passing in this simple example), there are completely blank columns in the middle of the dataframe as well as at the right end, so that the code shifts too far. Is there a clean way of counting the columns from right to left until one is not completely empty?
Much thanks, and I hope my question is clear!
Is there a clean way of counting the columns from right to left until one is not completely empty?
from itertools import takewhile
len(df.columns) - len(list(takewhile(lambda col: df[col].isnull().all(), reversed(df.columns)))) - 1
Explanation:
takewhile returns all elements of a list (beginning at the front) until the given condition is False. When we call it on reversed(df.columns), we get all elements from the end. With df[col].isnull().all() we can check whether all entries of a column are null (a.k.a. nan). Consequently the above takewhile expression returns the suffix of columns which are completely 'empty'. By calculating total_length - bad_suffix_length - 1, we get the first index for which the condition is not satisfied.
Adding to the correct response from Michael Hoff (Thank you very much!), the code has been edited to
to_shift=len(df.columns) - len(list(takewhile(lambda col: df[col].isnull().all(), reversed(df.columns)))) #Index of origianl dataframe to keep
df2=df.drop(list(takewhile(lambda col: df[col].isnull().all(), reversed(df.columns))),axis=1) #Drop the empty right side columns
colnames=df.columns[-to_shift:]
df2.columns=colnames