How to display large numbers in their abbreviated form? [duplicate]

I want to make a number format in Google Sheets that turns large numbers into their abbreviated form. Example: "1 200" -> "1.2k", "1 500 000 000 000 000" (one point five quadrillion) -> "1.5Qa". I have absolutely no idea what that would look like.
Thanks in advance.

This should cover your needs:
=ARRAYFORMULA(IF(A:A<10^3, A:A,
IF(1*A:A<10^6, TEXT(A:A/10^3, "#.0\k"),
IF(1*A:A<10^9, TEXT(A:A/10^6, "#.0\M"),
IF(1*A:A<10^12, TEXT(A:A/10^9, "#.0\B"),
IF(1*A:A<10^15, TEXT(A:A/10^12, "#.0\T"),
IF(1*A:A<10^18, TEXT(A:A/10^15, "#.0\Q\a"),
IF(1*A:A<10^21, TEXT(A:A/10^18, "#.0\Q\i"),
IF(1*A:A<10^24, TEXT(A:A/10^21, "#.0\S\x"),
IF(1*A:A<10^27, TEXT(A:A/10^24, "#.0\S\p"),
IF(1*A:A<10^30, TEXT(A:A/10^27, "#.0\O"),
IF(1*A:A<10^33, TEXT(A:A/10^30, "#.0\N"),
IF(1*A:A<10^36, TEXT(A:A/10^33, "#.0\D"),
IF(1*A:A<10^39, TEXT(A:A/10^36, "#.0\U"),
IF(1*A:A<10^42, TEXT(A:A/10^39, "#.0\D\d"),
IF(1*A:A<10^45, TEXT(A:A/10^42, "#.0\T\d"),
IF(1*A:A<10^48, TEXT(A:A/10^45, "#.0\Q\a\d"),
IF(1*A:A<10^51, TEXT(A:A/10^48, "#.0\Q\u\d"),
IF(1*A:A<10^54, TEXT(A:A/10^51, "#.0\S\x\d"),
IF(1*A:A<10^57, TEXT(A:A/10^54, "#.0\S\p\d"),
IF(1*A:A<10^60, TEXT(A:A/10^57, "#.0\O\d"),
IF(1*A:A<10^63, TEXT(A:A/10^60, "#.0\N\d"),
IF(1*A:A<10^66, TEXT(A:A/10^63, "#.0\V"),
IF(1*A:A<10^69, TEXT(A:A/10^66, "#.0\C"), ))))))))))))))))))))))))
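For readers who want to check the tier logic outside the spreadsheet, here is a minimal Python sketch of the same idea: find the largest power-of-1000 tier the number reaches, divide by it, and append the suffix. The function name abbreviate and the shortened SUFFIXES table are illustrative assumptions, not part of the answer, and Python's rounding of the single decimal place may differ from TEXT() in edge cases.

```python
# Illustrative subset of the tiers used in the formula: (power of ten, suffix).
SUFFIXES = [(3, "k"), (6, "M"), (9, "B"), (12, "T"), (15, "Qa"), (18, "Qi")]


def abbreviate(value):
    """Return value as text with one decimal place and a magnitude suffix."""
    # Like the first IF in the formula, anything below 1000 is returned as is.
    if value < 10**3:
        return str(value)
    # Walk from the largest tier down and stop at the first one the value reaches.
    for exponent, suffix in reversed(SUFFIXES):
        if value >= 10**exponent:
            return f"{value / 10**exponent:.1f}{suffix}"


print(abbreviate(1200))
print(abbreviate(1_500_000_000_000_000))
```

Extending the table with more tiers only requires appending further (exponent, suffix) pairs in ascending order.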

Use a custom number format
Select the range of cells you want to convert
Go to Format -> Number -> More Formats -> Custom number format
Paste into the input field [>999999]#,,"M";#,"K"
Click on Apply - Done

I do not think it is possible, without some scripting, to configure more than two formats for a cell that adapt dynamically to the number inside it. That would be nice, as it would preserve the number type.
But if there is no need to preserve the number type and a string is acceptable, then strings can be generated like this, using the TEXT function and building the number format dynamically from a reference table:
=INDEX(
TEXT(
E2:E24,
"0.0"
& IFNA(
REPT(",", (VLOOKUP(INT(LOG10(E2:E24)), $C$2:$C$8, 1, TRUE)) / 3)
& "\" & VLOOKUP(INT(LOG10(E2:E24)), {$C$2:$C$8, $A$2:$A$8}, 2, TRUE)
)
)
)
The formula reads from reference columns on the left of the sheet (columns A and C), where I used the symbols from the wiki.
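The lookup idea in this formula can be sketched in Python as follows: take the integer part of log10 of the number, find the largest reference exponent that does not exceed it (what VLOOKUP with TRUE does on a sorted column), then scale and append the symbol. The contents of EXPONENTS and SYMBOLS are assumptions standing in for the reference columns, which are not shown in the answer, and the function name is hypothetical.

```python
import bisect
import math

# Hypothetical stand-in for the reference columns (exponent -> symbol).
EXPONENTS = [3, 6, 9, 12]
SYMBOLS = ["k", "M", "G", "T"]


def abbreviate_by_lookup(value):
    """Format a positive number with one decimal and a looked-up symbol."""
    # math.log10 raises ValueError for zero or negative input.
    magnitude = math.floor(math.log10(value))
    # Largest reference exponent <= magnitude, like an approximate-match VLOOKUP.
    index = bisect.bisect_right(EXPONENTS, magnitude) - 1
    if index < 0:
        # No tier applies (the IFNA branch): plain "0.0" format, no symbol.
        return f"{value:.1f}"
    return f"{value / 10**EXPONENTS[index]:.1f}{SYMBOLS[index]}"


print(abbreviate_by_lookup(1200))
print(abbreviate_by_lookup(1_500_000))
```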

Related

How to simplify this google sheets regex sequence?

I want to make the following transformation to a set of data in my Google spreadsheet:
6 views -> 6
73K views -> 73000
3650 -> 3650
163K views -> 163000
1.2K views -> 1200
52.5K -> 52500
All the data are in one column, and depending on the case I need to apply a specific transformation.
I tried to put all the regexes in one formula but I failed. I always ended up with a case caught by two regular expressions, etc.
Anyway, I ended up writing these regexes one case at a time in different columns. It works fine, but I feel like it could slow down the sheet since I expect a lot of data coming into this sheet.
Here is the sheet : spreadsheet
Thank you for your help !
Use regexreplace(), like this:
=arrayformula(
iferror( 1 /
value(
regexreplace(
regexreplace(trim(A2:A), "\s*K", "e3"),
" views", ""
)
)
^ -1 )
)
See your sample spreadsheet.
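As a rough cross-check of what the nested regexreplace() calls do, the same two substitutions can be sketched in Python: turn an optional-space K into the exponent suffix e3, drop " views", and parse the result as a number. The function name parse_views is an illustrative assumption, and returning None for blank input stands in for the iferror() wrapper.

```python
import re


def parse_views(text):
    """Convert strings such as '73K views' or '52.5K' to a number."""
    # 'K' (with optional leading whitespace) becomes scientific notation.
    cleaned = re.sub(r"\s*K", "e3", text.strip())
    cleaned = cleaned.replace(" views", "")
    # Blank cells yield None instead of an error.
    return float(cleaned) if cleaned else None


for sample in ["6 views", "73K views", "3650", "163K views", "1.2K views", "52.5K"]:
    print(sample, "->", parse_views(sample))
```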
Remove ' views' using a regex: / views/gi
To replace 'K' with or without a decimal value, capture the number in front of K, drop the K, and multiply the number by 1000.
Use a callback function:
txt.replace(/ views/gi, '').replace(/(\d*\.?\d+)K/gi, (match, num) => num * 1000)
Code:
const arr = [`6 views`,
`73K views`,
`3650`,
`163K views`,
`1.2K views`,
`52.5K`];
arr.forEach(txt => {
  // Strip ' views', then capture the whole number before K so single-digit values such as 5K also convert.
  console.log(txt.replace(/ views/gi, '').replace(/(\d*\.?\d+)K/gi, (match, num) => num * 1000));
});
Output:
6
73000
3650
163000
1200
52500
Say your inputs are in column A. Empty cells allowed. In any other column,
=arrayformula(if(A2:A<>"",value(substitute(substitute(A2:A," views",""),"K","e3")),))
works.
Adjust the range A2:A as needed.
Also note that non-empty cells with empty strings are ignored.
Basically, since the regex engine in Google Sheets (RE2) doesn't support lookaround, it is more efficient to take advantage of the rather strict patterns in your application and use substitute() instead.

Parsing a name from a complex string in Tableau

I have a series of values in Tableau that are long strings intermixed with letters and numbers. I am unable to control the data output, but would like to parse the names from these strings. They follow the following format:
Potato 1TByte 4.5 NFA
Board 256GByte 553 NCA
Launch 4 512GByte 4.5 NFA
Launch 4S 512GByte 4.5 NCA
From each of these, I am attempting to capture the following:
"Potato"
"Board"
"Launch 4"
"Launch 4S"
Each string follows the same format: the name, followed by size, followed by some extra information we don't really care about.
I've tried to put together some text parsing strings, but am coming up short, and am still trying to learn regular expressions.
The Tableau calculated field I was trying to work with was something like the following:
LEFT([String], FIND([String], "Byte") - 2)
The issue is that the text and numbers preceding Byte can be anywhere from 2 to 4 characters long, and I need a way to identify that length.
Any help would be greatly appreciated!
One option which uses a regex replacement:
REGEXP_REPLACE('Launch 4 512GByte 4.5 NFA', ' \d+[A-Z]Byte .*$', '')
This strips off everything from the Byte term to the right, leaving us with only the product name.
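To try the same pattern outside Tableau, here is a small Python sketch using re.sub with the identical regular expression. The function name product_name is an illustrative assumption.

```python
import re


def product_name(text):
    """Strip the size token (e.g. ' 512GByte') and everything after it."""
    return re.sub(r" \d+[A-Z]Byte .*$", "", text)


for sample in [
    "Potato 1TByte 4.5 NFA",
    "Board 256GByte 553 NCA",
    "Launch 4 512GByte 4.5 NFA",
    "Launch 4S 512GByte 4.5 NCA",
]:
    print(repr(product_name(sample)))
```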
You could try the following, which seems to work. Below are the formulas for the derived columns (your source column is called [Name]):
Step1 = LEFT([Name],FIND([Name],"Byte")-1)
Step2 = LEN([Step1])-LEN(REPLACE([Step1]," ",""))
Step3 = FINDNTH([Step1]," ",[Step2])
Step4 = LEFT([Step1],[Step3]-1)
And of course you can nest all of these in a single calculated field; I kept them as separate columns for easier understanding.
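The four steps can be traced in Python to see what each intermediate column holds. This is only a sketch of the same logic with 0-based string positions instead of Tableau's 1-based ones; the function name name_by_steps is an illustrative assumption.

```python
def name_by_steps(text):
    """Mirror Step1..Step4: cut at 'Byte', then cut at the last space."""
    # Step1: everything before "Byte", e.g. "Launch 4 512G".
    step1 = text[: text.index("Byte")]
    # Step2: number of spaces in Step1.
    step2 = step1.count(" ")
    # Step3: position of the Step2-th space, i.e. the last space (like FINDNTH).
    position = -1
    for _ in range(step2):
        position = step1.index(" ", position + 1)
    # Step4: everything before that space is the name.
    return step1[:position]


print(name_by_steps("Launch 4 512GByte 4.5 NFA"))
```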

Data validation using regular expressions in Google Sheets

I am using the below date/time format in gSheets:
01 Apr at 11:00
I wonder whether it is possible to use Data Validation (or any other function) to report an error (add the small red triangle to the corner of the cell) when the format differs in any way.
Possible values in the given format:
01 -> any number between 01-31 (but not "1", there must be the leading zero)
space
Apr -> 3 letters for month (Jan, Feb, Mar... Dec)
space
at
space
11 -> hours in 24h format (00, 01...23)
:
00 -> minutes (00, 01,...59)
Is there any way to validate that the cell contains "text/data" exactly in the above mentioned format?
The right way to do this is to use a regular expression with the regexmatch() function in Google Sheets. For the given example, I made the regular expression below:
[0-3][0-9] (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) at [0-2][0-9]\:[0-5][0-9]
Process:
Select range of cells to be validated
Go to Data > Data Validation
Under Criteria select "Custom formula is" (the label may differ in other interface languages)
Paste: =regexmatch(to_text(K4); "[0-3][0-9] (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) at [0-2][0-9]\:[0-5][0-9]")
Make sure that instead of K4 in "to_text(K4)" you use the upper-left cell of the selected range
Save
Hope it helps someone :)
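Note that the pattern as written also accepts day values such as 00 or 39 and hours such as 29, and regexmatch() succeeds when any part of the text matches. A stricter variant, shown here as a Python sketch rather than as the answer's own pattern, limits days to 01-31 and hours to 00-23 and requires the whole text to match. The names PATTERN and is_valid are illustrative assumptions; the pattern still does not check the day against the month, so "31 Feb" passes.

```python
import re

# Stricter variant of the pattern: day 01-31, hour 00-23, minute 00-59.
PATTERN = re.compile(
    r"(0[1-9]|[12][0-9]|3[01]) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"at ([01][0-9]|2[0-3]):[0-5][0-9]"
)


def is_valid(text):
    """True only if the whole text has the form '01 Apr at 11:00'."""
    return PATTERN.fullmatch(text) is not None


print(is_valid("01 Apr at 11:00"))
print(is_valid("39 Apr at 29:00"))
```

In a Sheets formula, the whole-text requirement would correspond to adding ^ at the start and $ at the end of the pattern.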
You may try this formula for data validation:
=not(iserror(SUBSTITUTE(A1," at","")*1))*(len(A1)=15)*(right(A1,2)*1<60)
not(iserror(SUBSTITUTE(A1," at","")*1)) checks that the whole statement is a legal date
(len(A1)=15) checks that dates are entered with 2 digits
(right(A1,2)*1<60) checks for too many minutes; for some reason 01 Apr at 11:99 is otherwise treated as a legal date.
Select the range of cells where the data validation should apply.
Go to Data -> Data validation
For "Criteria" select "Custom formula is"
Enter the following in the text field next to "Custom formula is":
=regexmatch(Tablename!B2; "^[a-z_]*$")
where "Tablename" should be replaced by the name of your sheet and "B2" by the first cell of the range.
Put your regular expression inside the quotation marks. The one shown here allows only lowercase letters and underscores.
Wrapping the cell in to_text() as well didn't work for me, so you may want to avoid it to make sure the validation works.
Press Save

Make =IF Function Output Numbers For "Scoring": Google Sheets

I am exploring methods of giving scores to different datapoints within a dataset. These points come from a mix of numbers and text string attributes, looking for certain characteristics, e.g. if Col. A contains more than X number of "|", then give it a 1. If not, it gets a 0 for that category. I also have some that give the point when the value is >X.
I have been trying to do this with =IF, for example, =IF([sheet] = [Text], "1","0").
I can get it to give me 1 or 0, but I am unable to get a point total with sum.
I have tried changing the formatting of the text to both "number", "plain text", and have left it as automatic, but I can't get it to sum. Thoughts? Is there maybe a better way to do this?
FWIW - I'm trying to score based on about 12 factors.
Best,
Alex
The issue here might be that you're having the cell evaluate to either the string "0" or the string "1" rather than the number 0 or the number 1. That would explain why you're seeing the right things but the math isn't coming out right - the cell contents look like numbers, but they're really text, which the summation would then ignore.
One option would be to drop the quotation marks and write something like this:
=IF(condition, 1, 0)
This has the condition evaluate to 1 if it's true and 0 if it's false.
Alternatively, you could write something like this:
=(condition) * 1
This will take the boolean TRUE or FALSE returned by condition and convert it to either the numeric value 1 (true) or the numeric value 0 (false).
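The same two patterns can be illustrated with a short Python sketch, where each factor contributes a numeric 0 or 1 so the points can be summed. The function name score_row, the column meanings, and the thresholds are all hypothetical; they only stand in for the kinds of conditions described in the question.

```python
def score_row(label, amount, min_pipes=2, min_amount=100):
    """Hypothetical two-factor score; each factor yields a numeric 0 or 1."""
    # Boolean converted to a number, like =(condition) * 1.
    pipe_point = int(label.count("|") > min_pipes)
    # Explicit numeric branches, like =IF(condition, 1, 0).
    amount_point = 1 if amount > min_amount else 0
    return pipe_point + amount_point


rows = [("a|b|c|d", 250), ("a|b", 250), ("a", 5)]
total = sum(score_row(label, amount) for label, amount in rows)
print(total)
```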

Multiple values for one weka attribute

Apologies as I'm a complete novice when it comes to Weka.
I have 100 instances and each instance has 400 attributes most of which have a single value. However some attributes have multiple values as they contain a time component. I was wondering if Weka can analyse multiple values for one attribute and if so, how do I separate these values so that weka can read them (e.g. commas, semi-colons?)
Many Thanks for your help
R
Weka natively works with a format called ARFF, an acronym for Attribute-Relation File Format. An ARFF file has a clearly differentiated structure in three parts:
1. Header. Here, the name of the relation is defined. Its format is as follows:
@relation <relation-name>
where <relation-name> is a string. If the name contains spaces, it must be enclosed in quotation marks.
2. Attribute declarations. This section declares the attributes that make up the file, together with their types. The syntax is:
@attribute <attribute-name> <type>
where <attribute-name> is a string with the same restrictions as above.
Weka accepts various types:
a) NUMERIC. Real numbers.
b) INTEGER. Whole numbers.
c) DATE. Dates. The type can be followed by a quoted format label, made up of separator characters (hyphens and/or spaces) and time units:
dd Day.
MM Month.
yyyy Year.
HH Hours.
mm Minutes.
ss Seconds.
d) STRING. Text strings, with the same restrictions mentioned previously.
e) Nominal (enumerated). This type is written as the list of possible values (or character strings) that the attribute can take, enclosed in braces and separated by commas. For example, an attribute that indicates the weather could be defined as:
@attribute weather {sunny, rainy, cloudy}
3. Data section. This section declares the data that make up the relation, with commas separating the attribute values and line breaks separating the instances.
@data
4,3.2
Although this is the "full" form, the data can also be written in a short form (sparse data). If a sample contains many zeros, the zero-valued items can be omitted: each row is enclosed in braces, and each remaining value is preceded by its attribute number.
An example of this is as follows:
@data
{14 1, 3 3}
An unknown value is written as a question mark ("?"). To add comments, use the character %.
So, you can use several values to construct your dataset.
Example:
% Test Weka.
@relation MyTest

@attribute nombre STRING
@attribute ojo_izquierdo {Bien,Mal}
@attribute dimension NUMERIC
@attribute fecha_analisis DATE "dd-MM-yyyy HH:mm"

@data
Antonio,Bien,38.43,"12-04-2003 12:23"
'Maria Jose',?,34.53,"14-05-2003 13:45"
Juan,Bien,43,"01-01-2004 08:04"
Maria,?,?,"03-04-2003 11:03"
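If the data start out in another program, a file in this layout can be generated with a few lines of code. The following Python sketch is one possible way to do it under simple assumptions: the helper name to_arff is hypothetical, missing values are passed as None, and any value containing a space is wrapped in single quotes. It does not escape quotes inside values and does not use Weka's own libraries.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text from (name, type) pairs and rows of values."""

    def cell(value):
        # Unknown values are written as a question mark.
        if value is None:
            return "?"
        text = str(value)
        # Values containing spaces must be quoted.
        return f"'{text}'" if " " in text else text

    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} {kind}" for name, kind in attributes]
    lines += ["", "@data"]
    lines += [",".join(cell(value) for value in row) for row in rows]
    return "\n".join(lines)


print(to_arff(
    "MyTest",
    [("nombre", "STRING"), ("dimension", "NUMERIC")],
    [("Antonio", 38.43), ("Maria Jose", None)],
))
```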