Concatenate text with newlines in PowerBI - powerbi

If I have a column in a table where each cell contains text, how can I push them as output into e.g. a card and separate the cells with a new line?
I have been using the CONCATENATEX function, which takes a delimiter argument; however the standard new line character ('\n') doesn't work.

It is possible to pass Unicode characters to the CONCATENATEX function, using the UNICHAR(number) function.
The number parameter corresponds what looks to be the decimal UTF-16 or UTF-32 encodings (as shown here).
This means a new line is given by UNICHAR(10).
A final solution might then be: CONCATENATEX(TableName, TableName[TextColumn], UNICHAR(10))
Here is a screenshot that shows:
the input table in Excel (top left)
The table once imported into Power BI Desktop (top right)
The Measure 'Description' and the output within a Card object (bottom)
In the last line of the Measure code, marked yellow, you can see the use of UNICHAR(10) as a new line separator.
If nothing were to be selected in the Slicer object (i.e. everything is selected by default - no filter is used), then "Show other text" would be displayed in the Card.

In the concatenate code, if you insert a "shift + enter" after the comma, it gives me a line break, without breaking the code.
Example:
'Query1'[LetterCode],
",<Inserted SHIFT-ENTER here>
",
'Query1'[LetterCode],

Substitute /n with the string "#(cr)#(lf)" to create a new line.

Related

BigQuery remove <0x00> hidden characters from a column

I have a table with unwanted hidden characters such as my_table:
id
fruits
1
STuff1 stuff_2 ����������������������
2
Blahblah-blahblah �������������
3
nothing
How do I remove ���������������������� when selecting this column?
Current query:
SELECT fruits, TRIM(REGEXP_REPLACE(fruits, r'[^a-zA-Z,0-9,-]', ' ')) AS new_fruits
FROM `project-id.MYDATASET.my_table`
This query is too flaw because I'm worried if I accidentally exclude/replace important data. I only want to be specific on this weird characters.
Upon opening the data as csv, the weird characters shows as <0x00>. How do I solve this?
First you have to identify which is this character, because as it is a non printable this sign is just a random representation. For replace it without remove any other important information, do the following:
identify the hexadecimal of the character. Copy from csv and past on this site:
Use the replace function in bigquery replacing the char of this hex, as following:
SELECT trim(replace(string_field_1,chr(0xfffd)," ")) FROM `<project>.<dataset>.<table>`;
if your character result is different than fffd, put you value on the chr() function

Adding a space within a line in file with a specific pattern

I have a file with some data as follows:
795 0.16254624E+01-0.40318151E-03 0.45064186E+04
I want to add a space before the third number using search and replace as
795 0.16254624E+01 -0.40318151E-03 0.45064186E+04
The regular expression for the search is \d - \d. But what should I write in replace, so that I could get the above output. I have over 4000 of similar lines above and cannot do it manually. Also, can I do it in python, if possible.
Perhaps you could findall to get your matches and then use join with a whitespace to return a string where your values separated by a whitespace.
[+-]?\d+(?:\.\d+E[+-]\d+)?\b
import re
regex = r"[+-]?\d+(?:\.\d+E[+-]\d+)?\b"
test_str = "795 0.16254624E+01-0.40318151E-03 0.45064186E+04"
matches = re.findall(regex, test_str)
print(" ".join(matches))
Demo
You could do it very easily in MS Excel.
copy the content of your file into new excel sheet, in one column
select the complete column and from the data ribbon select Text to column
a wizard dialog will appear, select fixed width , then next.
click just on the location where you want to add the new space to tell excel to just split the text after this location into new column and click next
select each column header and in the column data format select text to keep all formatting and click finish
you can then copy all the new column or or export it to new text file

How can I normalize / asciify Unicode characters in Google Sheets?

I'm trying to write a formula for Google Sheets which will convert Unicode characters with diacritics to their plain ASCII equivalents.
I see that Google uses RE2 in its "REGEXREPLACE" function. And I see that RE2 offers Unicode character classes.
I tried to write a formula (similar to this one):
REGEXREPLACE("público","(\pL)\pM*","$1")
But Sheets produces the following error:
Function REGEXREPLACE parameter 2 value "\pL" is not a valid regular expression.
I suppose I could write a formula consisting of a long set of nested SUBSTITUTE functions (Like this one), but that seems pretty awful.
Can any offer a suggestion for a better way to normalize Unicode letters with diacritical/accent marks in a Google Sheets formula?
[[:^alpha:]] (negated ASCII character class) works fine for REGEXEXTRACT formula.
But =REGEXREPLACE("público","([[:alpha:]])[[:^alpha:]]","$1") gives "pblic" as a result. So, I guess, formula doesn't know what exact ASCII character must replace "ú".
Workaround
Let's take the word públicē; we need to replace two symbols in it. Put this word in cell A1, and this formula in cell B1:
=JOIN("",ArrayFormula(IFERROR(VLOOKUP(SPLIT(REGEXREPLACE(A1,"(.)","$1-"),"-"),D:E,2,0),SPLIT(REGEXREPLACE(A1,"(.)","$1-"),"-"))))
And then make directory of replacements in range D:E:
D E
1 ú u
2 ē e
3 ... ...
This formula is still ugly, but more useful because you can control your directory by adding more characters to the table.
Or use Java Script
Also found a good solution, which works in google sheets.
This did it for me in Google Sheets, Google Apps Scripts, GAS
function normalizetext(text) {
var weird = 'öüóőúéáàűíÖÜÓŐÚÉÁÀŰÍçÇ!#£$%^&*()_+?/*."';
var normalized = 'ouooueaauiOUOOUEAAUIcC ';
var idoff = -1,new_text = '';
var lentext = text.toString().length -1
for (i = 0; i <= lentext; i++) {
idoff = weird.search(text.charAt(i));
if (idoff == -1) {
new_text = new_text + text.charAt(i);
} else {
new_text = new_text + normalized.charAt(idoff);
}
}
return new_text;
}
This answer doesn't require a Google App Script, and it's still fast, and relatively simple. It builds on Max's answer by providing a full lookup table, and it also allows for case-sensitive transliteration (normally VLOOKUP is NOT case-sensitive).
Here is a link to the Google Spreadsheet if you want to jump right into it. If you want to use your own sheet, you'll need to copy the TRANS_TABLE sheet into your Spreadsheet.
In the code snippet below, the source cell is A2, so you'd place this formula in any column on row 2. Using REGEXREPLACE AND SPLIT, we split apart the string in A2 into an array of characters, then USING ARRAYFORMULA, we do the following to EACH character in the array: First, the character is converted to its 'decimal' CODE equivalent, then matched against a table on the TRANS_TABLE sheet by that number, then using VLOOKUP, a character X number of columns over (the index value provided) on the TRANS_TABLE sheet (in this case, the 3rd column over) is returned. When all characters in the array have been transliterated, we finally JOIN the array of characters back into a single string. I provided examples with named ranges as well.
=iferror(
join(
"",
ARRAYFORMULA(
vlookup(
code(split(REGEXREPLACE($A2,"(.)", "$1;"),";",TRUE)),
TRANS_TABLE!$A$5:$F,3
)
)
)
,)
You'll note on the TRANS_TABLE sheet I made, I created 4 different transliteration columns, which makes it easy to have a column for each of your transliteration needs. To reference the column, just use a different index number in the VLOOKUP. Each column is simply a replacement character column. In some cases, you don't want any conversion made (A -> A or 3 -> 3), so you just copy the same character from the source Glyph column. Where you DO want to convert characters, you type in whatever character you want replaced (ñ -> n etc). If you want a character removed altogether, you leave the cell blank (? -> ''). You can see examples of the transliteration output on the data sheet in which I created 4 different transliteration columns (A-D) referencing each of the Transliteration tables from the TRANS_TABLE sheet for different use case scenarios.
I hope this finally answers your question in a fashion that isn't so "ugly." Cheers.

Regular Expression in ms excel

How can I use regular expression in excel ?
In above image I have column A and B. I have some values in column A. Here I need to move data after = in column B. For e.g. here in 1st row I have SELECT=Hello World. Here I want to remove = sign and move Hello world in column B. How can I do such thing?
Stackoverflow has many posts about adding regular expressions to Excel using VBA. For your particular example, you would need VBA to actually move a substring from one cell to another.
If you simply want to copy the substring, you can do so easily using the MID function:
=IFERROR(MID(A1,FIND("=",A1)+1,999),A1)
I used 999 to ensure that enough characters were grabbed.
IFERROR returns the cell as-is if an equals sign is not found.
To return the portion of string before the equals sign, do this:
=LEFT(A1,FIND("=",A1&"=")-1)
In this case, I appended the equals sign to A1, so FIND won't return an error if not found.
You can use the Text to Column functionality of MS-Excel giving '=' as delimiter.
Refer to this link:
Chop text in column to 60 charactersblocks
You can simply use Text to Column feature of excel for this:
Follow the below steps :
1) Select Column A.
2) Goto Data Tab in Menu Bar.
3) Click Text to Column icon.
4) Choose Delimited option and do Next and then check the Other options in delimiter and enter '=' in the entry box.
5) Just click finish.
Here are URL for Text to Column : http://www.excel-easy.com/examples/text-to-columns.html

Stata: removing line feed control characters

I have a dataset which I export with command outsheet into a csv-file. There are some rows which breaks line at a certain place. Using a hexadecimal editor I could recognize the control character for line feed "0a" in the record. The value of the variable producing the line break shows visually (in Stata) only 5 characters. But if I count the number of characters:
gen xlen = length(x)
I get 6. I could write a Perl programm to get rid of this problem but I prefer to remove the control characters in Stata before exporting (for example using regexr()). Does anyone have an idea how to remove the control characters?
The char() function calls up particular ASCII characters. So, you can delete such characters by replacing them with empty strings.
replace x = subinstr(x, char(10), "", .)