Diff 2 sets of rows in Google Sheets

Is there a simple way of diff'ing 2 sets of rows of data in Google Sheets?
For example, Sheet 1 contains 10 rows:
One
Two
Three
Four
Five
Six
Seven
Eight
Nine
Ten
Sheet 2 contains 13 rows:
One
Three
Five
Six
Seven
Eight
Nine
Ten
Eleven
Twelve
Thirteen
Fourteen
Fifteen
Ideally I'd like to be able to run some formula to diff the two data sets and identify the additions and deletions in the second data set.

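Try:
=FILTER(A:A, COUNTIF(B:B, A:A))
which returns the values in column A that also appear in column B, and:
=FILTER(A:A, NOT(COUNTIF(B:B, A:A)))
which returns the values in column A that are missing from column B.
Swap A:A and B:B to get the values that appear only in column B.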
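For comparison, the same diff can be sketched outside the spreadsheet. This is only a minimal Python illustration, not part of the original answer; the names `sheet1`, `sheet2` and `diff_rows` are hypothetical, and the lists simply hold the example rows from the question.

```python
def diff_rows(old, new):
    """Return (additions, deletions) of `new` relative to `old`, keeping row order."""
    old_set, new_set = set(old), set(new)
    additions = [row for row in new if row not in old_set]  # only in the second set
    deletions = [row for row in old if row not in new_set]  # only in the first set
    return additions, deletions


sheet1 = ["One", "Two", "Three", "Four", "Five",
          "Six", "Seven", "Eight", "Nine", "Ten"]
sheet2 = ["One", "Three", "Five", "Six", "Seven", "Eight", "Nine", "Ten",
          "Eleven", "Twelve", "Thirteen", "Fourteen", "Fifteen"]

additions, deletions = diff_rows(sheet1, sheet2)
print("Added:", additions)
print("Deleted:", deletions)
```

Here the "deletions" list plays the role of the NOT(COUNTIF(...)) filter, and the "additions" list is the same filter with the two columns swapped.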

One option is to match values in both sets and then check if all have been found or not.
=AND(ARRAYFORMULA(IF(Sheet5!A1:A="",,IF(ISERROR(VLOOKUP(Sheet5!A1:A,Sheet6!A:A,1,false)),FALSE,TRUE))))
This formula checks whether every non-blank value in column A of Sheet5 is present in column A of Sheet6.
The result is TRUE only if all of them have been found.
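The same "is everything present" check can be sketched in Python. This is an illustrative analogue under assumptions, not the spreadsheet formula itself; `all_present` is a hypothetical helper, and empty strings stand in for blank cells.

```python
def all_present(needles, haystack):
    """True if every non-blank value in `needles` also occurs in `haystack`."""
    lookup = set(haystack)
    # Blank cells are skipped, mirroring the IF(...="",,...) guard in the formula.
    return all(value in lookup for value in needles if value != "")


print(all_present(["One", "", "Three"], ["One", "Two", "Three"]))
print(all_present(["One", "Four"], ["One", "Two", "Three"]))
```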

Related

Google Sheets - perform mathematical operations after extracting numeric data from strings with specific text

I need a formula that performs a specific mathematical operation, but only with the number that meets specific conditions. In this case – with numbers extracted from strings with specific text in them.
In the first column we have some raw data: a string with different numbers and text divided by the underscore. I need to split this data into several different rows and use the following formula for this: =TRANSPOSE(SPLIT(A3,"_"))
The next column should only contain numbers, but the problem is that one of these numbers (which contains "tb" in this specific example) should be divided or multiplied by the specific number (multiplied by 1000 in this case).
I've tried the following formula which only works as long as there is no "tb" or if it's in the very beginning of the string: =IF(REGEXMATCH(A6,"tb"),REGEXEXTRACT(A6,"(\d+)tb")*1000,REGEXEXTRACT(B6,"(\d+)"))
If it's somewhere in the middle or at the end of the string only the first number still undergoes the math operation instead. I wonder if there's a way to achieve the result I want without resorting to complex formulas (I'm very new to this and would ideally like to use formulas that I can understand and easily modify for other similar tasks). A sample table for better visualisation can be seen below. Thanks in advance!
Raw data        | Split data | Extracted numbers (what I get) | Desired outcome
5tb_200gb_300mb | 5tb        | 5000                           | 5000
                | 200gb      | 200                            | 200
                | 300mb      | 300                            | 300
2tb_500gb_50mb  | 2tb        | 2000                           | 2000
                | 500gb      | 500                            | 500
                | 50mb       | 50                             | 50
500gb_50mb_2tb  | 500gb      | 2000                           | 500
                | 50mb       | 50                             | 50
                | 2tb        | 2                              | 2000

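Try testing the split cell (B6) rather than the raw string (A6), and not putting tb in the first REGEXEXTRACT, so that the multiplication applies only to the piece that actually contains "tb":
=IF(REGEXMATCH(B6,"tb"),REGEXEXTRACT(B6,"(\d+)")*1000,REGEXEXTRACT(B6,"(\d+)"))
EDIT
Option 2: extract only the digits immediately before "tb":
=IF(REGEXMATCH(B6,"tb"),REGEXEXTRACT(B6,"(\d+)tb")*1000,REGEXEXTRACT(B6,"(\d+)"))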
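The intended per-piece logic can be sketched in Python to show what the desired outcome column computes. This is a minimal sketch, not the spreadsheet solution; `extract_sizes` is a hypothetical helper, and the factor of 1000 for "tb" is taken from the question.

```python
import re


def extract_sizes(raw):
    """Split on "_" and return one number per piece, multiplying "tb" pieces by 1000."""
    values = []
    for part in raw.split("_"):
        match = re.search(r"\d+", part)  # first run of digits in this piece
        if match is None:
            continue  # no number in this piece, skip it
        number = int(match.group(0))
        if "tb" in part:  # decide per piece, not per raw string
            number *= 1000
        values.append(number)
    return values


print(extract_sizes("500gb_50mb_2tb"))
```

The key point is that the "tb" test runs on each split piece, so its position in the raw string no longer matters.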

Trying to find Top 10 products within categories through Regex

I have a ton of products, separated into different categories.
I've aggregated each product's revenue within its category, and I now need to locate the top 10.
The issue is that not every product has sold within a given timeframe, and some categories don't even have 10 products, leaving me with fewer than 10 values.
As an example, these are some of the values:
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,3,3,5,6,20,46,47,53,78,92,94,111,115,139,161,163,208,278,291,412,636,638,729,755,829,2673
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,57,124,158,207,288,547
0,0,90,449,1590,10492
0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,7,12,14,32,32,37,62,64,64,64,94,100,103,109,113,114,114,129,133,148,152,154,160,167,177,188,205,207,207,209,214,214,224,225,238,238,244,247,254,268,268,285,288,298,301,305,327,333,347,348,359,362,368,373,402,410,432,452,462,462,472,482,495,511,512,532,566,597,599,600,609,620,636,639,701,704,707,728,747,768,769,773,805,833,899,937,1003,1049,1150,1160,1218,1230,1262,1327,1377,1396,1474,1532,1547,1565,1760,1768,1836,1962,1963,2137,2293,2423,2448,2451,2484,2529,2609,3138,3172,3195,3424,3700,3824,4310,4345,4415,4819,4943,5083,5123,5158,5334,5734,6673,7160,7913,9298,9349,10148,11047,11078,12929,18535,20756,28850,63447
63,126
How would you get as close as possible to capturing the top 10 within a category, and how would you ensure that it is only products that have sold, that are included as a possibility? And all of this through Regex.
My current setup is only finding top 3 and a very basic setup:
Step 1: ^.*\,(.*\,.*\,.*)$ finding top 3
Step 2: ^(.*)\,.*\,.*$ finding the lowest value of the top 3 products
Step 3: Checking if original revenue value is higher than, or equal to, step 2 value.
Step 4: If yes, then bestseller, otherwise just empty value.
Thanks in advance

You didn't specify a programming language, so I'm going with JavaScript here, but this regex is compatible with almost any regex flavor:
(?:[1-9]\d*,){0,9}[1-9]\d*$
(?:[1-9]\d*,){0,9} - between 0 and 9 times, find numbers followed by a comma; ignore zero revenue
[1-9]\d* - guarantee a non-zero revenue one time
$ - end line anchor
https://regex101.com/r/1xBQD3/1
If your data were to have leading zeros like 0,0,00090,00449,01590,10492 for some reason then you would need this regex which is 33% more expensive:
(?:0*[1-9]\d*,){0,9}0*[1-9]\d*$
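As a quick check, the first pattern can be tried in Python, whose `re` module accepts the same syntax. This is a minimal sketch that assumes, as in the question's samples, that each category's values are sorted in ascending order; `top_sellers` is a hypothetical helper name.

```python
import re

# Up to nine non-zero numbers followed by commas, then one final non-zero number.
TOP_10 = re.compile(r"(?:[1-9]\d*,){0,9}[1-9]\d*$")


def top_sellers(revenues):
    """Return the trailing run of up to 10 non-zero values, or None if nothing sold."""
    match = TOP_10.search(revenues)
    return match.group(0) if match else None


print(top_sellers("0,0,90,449,1590,10492"))  # 90,449,1590,10492
print(top_sellers("0"))                      # None
print(top_sellers("63,126"))                 # 63,126
```

The first number in the returned string is the lowest bestseller revenue, which can serve as the threshold from step 2 of the question.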

(C++) How do you generate a seven digit number, whose digits, when added together, equal to a multiple of seven?

I know this sounds like a very specific question, but I am making a key generator for a program, and one part of the key has seven digits that need to be a multiple of seven when added together. How do you achieve this?

Generate six random digits, then choose the seventh digit so that the sum of all seven digits is a multiple of seven. Such a digit always exists: if the first six digits sum to s, use (7 - s % 7) % 7.
See any example of how to use std::rand for generating the random digits.
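The idea can be sketched as follows. The sketch is in Python rather than C++ and is only an illustration; `make_key_digits` is a hypothetical name, and it assumes leading zeros are acceptable in this part of the key.

```python
import random


def make_key_digits(rng=random):
    """Return a 7-character digit string whose digit sum is a multiple of 7."""
    digits = [rng.randrange(10) for _ in range(6)]  # six free random digits
    check = (7 - sum(digits) % 7) % 7               # digit that completes the sum
    digits.append(check)
    return "".join(str(d) for d in digits)


key = make_key_digits()
print(key, sum(int(ch) for ch in key) % 7)
```

One consequence of this choice is that the last digit is always between 0 and 6, which may or may not matter for a key format.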

Once you have generated the 7 digits for the key, add them together and take the sum modulo 7 (% 7).
If the result is 0, the digit sum is a multiple of 7.
If the result is not 0, either regenerate the key or add 1 to the last digit (wrapping from 9 back to 0) until the sum modulo 7 equals 0.
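The regenerate-until-valid variant might look like this. Again this is a Python sketch rather than C++, with hypothetical names (`digit_sum`, `make_key_by_retry`), and it assumes leading zeros are allowed.

```python
import random


def digit_sum(key):
    """Sum of the decimal digits in a digit string."""
    return sum(int(ch) for ch in key)


def make_key_by_retry(rng=random):
    """Draw random 7-digit strings until one has a digit sum divisible by 7."""
    while True:
        key = "".join(str(rng.randrange(10)) for _ in range(7))
        if digit_sum(key) % 7 == 0:
            return key


print(make_key_by_retry())
```

Roughly one candidate in seven passes the test, so the loop normally ends after a handful of attempts, and unlike the computed-last-digit approach it leaves every digit position free to take any value.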

Limiting decimal length in sas

I have 2 datasets which I am comparing. I have taken the difference between each column in the two datasets. However, SAS is returning these differences to 15-16 decimal places. How can I limit the output to 8 decimal places?
For example, I have column A in dataset 1 and column A in dataset 2. I have created a new column newA, which is dataset 1 A minus dataset 2 A. The result comes out as 0.0009876543210987654. I want to see the output as 0.00098765, i.e. to 8 decimal places.

Use the ROUND function, ROUND(DIFFVAR,1e-8), or apply the format 10.8 to the difference variable.
Or use Proc COMPARE and the FUZZ option.
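The difference between rounding the stored value and merely formatting it for display can be illustrated outside SAS. This is a small Python analogue, not SAS code; `diff` holds the example value from the question.

```python
diff = 0.0009876543210987654

# Rounding changes the stored value to 8 decimal places.
rounded = round(diff, 8)
print(f"{rounded:.8f}")  # 0.00098765

# Formatting (width 10, 8 decimals) changes only the display; diff keeps full precision.
print(f"{diff:10.8f}")   # 0.00098765
```

Rounding is the safer choice if the differences are later compared or summed, while a format is enough if only the printed output matters.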

Convert alphanumeric string to 16 digit GCID

I'm building our inventory feed for Amazon Seller Central in OpenOffice Calc but can't work out how to convert our in-house product IDs to the GCID format that Amazon requires.
The standard-product-id must have a specific number of characters according to type: GCID (16 alphanumeric characters), UPC (12 digit number), EAN (13 digit number) or GTIN(14 digit number).
Our product IDs vary by manufacturer, eg:-
123456
AB123456
1234AB
Where the ID is numerical only I can format the cells with leading zeros, however this doesn't work if the cell contains letters.
My file has over 10,000 products so I'm wondering if there is a formula I can apply to all cells to instantly convert them to GCID?

The question seems to have been asked under a misapprehension. Still, the example 123456 AB123456 1234AB represents three different IDs, and padding to a specified length is quite a common requirement (e.g. see the String.PadLeft method), so a suggestion for OpenOffice might be of use to someone one day.
The convention is to pad with 0s. However, some spreadsheets automatically strip leading zeros from numbers (as in the first example), and databases tend to prefer fields of a consistent format, so I suggest separating the padding from the ID with a hyphen, which helps identify the alphanumeric codes and forces text format:
=REPT(0;15-LEN(A1))&"-"&A1
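To show what the formula produces, here is an equivalent sketch in Python. It is only an illustration; `pad_id` is a hypothetical helper, and it assumes IDs of at most 15 characters, since longer ones cannot be padded to 16.

```python
def pad_id(product_id, total=16):
    """Left-pad with zeros and a hyphen so the result is `total` characters long."""
    # One character is reserved for the hyphen; a non-positive count yields no zeros.
    zeros = "0" * (total - 1 - len(product_id))
    return f"{zeros}-{product_id}"


for pid in ["123456", "AB123456", "1234AB"]:
    print(pad_id(pid))
```

Every result is 16 characters long, for example 0000000-AB123456 for AB123456.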
