R Studio: Mutating a column variable based on two selection conditions

The dataframe above represents a repeated-measures design: each participant took part in both task A and task B. The condition determines the order in which the tasks occurred: in condition 1, task A came first followed by task B, and vice versa in condition 2.
I would like to mutate a new column in my dataframe called 'First Task'. This column must represent the scores from the task that always occurred first. For example, participant 1001 was in condition 1, so their score from task A should go into this first task column. For participant 1002, in condition 2, their score from task B should go into the first task column, and so on.
After scouring possible threads (which have always solved every need I have!), I considered using the mutate function combined with case_when (group == 1), but beyond that I am not sure how to properly pipe something along the lines of "select score from Task A". Alternatively, I considered using if or ifelse, which is probably the more likely way to do something like this?
I am after an elegant piece of code like this, as opposed to re-creating a new dataframe. I would greatly appreciate any thoughts or ideas on this. Let me know if this is clear (note that I have simplified this image as an example to make the question clearer).
Many thanks, community.
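
The rule described above (condition 1 takes the Task A score, condition 2 takes the Task B score) can be sketched row by row. This is a minimal illustration in plain Python rather than R. The column names and scores are invented because the original image is not available. In dplyr, the same rule would presumably be written as something like mutate(first_task = if_else(condition == 1, task_a, task_b)), assuming columns with those names.

```python
# Hypothetical rows: the scores are invented for illustration only.
rows = [
    {"subject_id": 1001, "condition": 1, "task_a": 12, "task_b": 15},
    {"subject_id": 1002, "condition": 2, "task_a": 9, "task_b": 14},
    {"subject_id": 1003, "condition": 1, "task_a": 11, "task_b": 8},
]


def add_first_task(row):
    """Return a copy of the row with a first_task score chosen by condition."""
    # Condition 1: task A came first. Any other condition (here, 2): task B came first.
    first = row["task_a"] if row["condition"] == 1 else row["task_b"]
    return {**row, "first_task": first}


with_first_task = [add_first_task(r) for r in rows]
for r in with_first_task:
    print(r["subject_id"], r["condition"], r["first_task"])
```

The original rows are left untouched and a new list is built, which mirrors how mutate returns a new data frame with the extra column.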

Related

How to automatically feed a cell value from a range of values, based on its matching condition with other cell value

I'm making a time-spending tracker based on the work I do every hour of the day.
Now, suppose I have 28 types of work listed in my tracker (which I also have to increase from time to time), and I have about 8 significance values that I have decided to relate to these 28 types of work, predefined.
As soon as I enter a type of work in cell 1, I want the adjacent cell 2 to be populated automatically with the significance value (from a range of 8 values) that I have predefined for it.
Every time I input a new or old occurrence of a type of work, the adjacent cell should automatically get matched with its relevant significance value & automatically get populated in real-time.
I know how to do it using IF, IFS, and IF_OR conditions, but I feel that based on the ever-expanding types of work & significance values, the above formulas will be very big, complicated, and repetitive in the future. I feel there's a more efficient way to achieve it. Also, I don't want it to be selected from a drop-down list.
Guys, please help me out with the most efficient way to handle this. TUIA :)
Also, I've added a snapshot and a sample sheet describing the problem.
Sample sheet

XLOOKUP() may work. Try:
=XLOOKUP(D2,A2:A,B2:B)
Or the FILTER() function, like:
=FILTER(B2:B,A2:A=D2)

You can use this formula for a whole column:
=INDEX(IFERROR(VLOOKUP(C14:C,A2:B9,2,0)))
Adapt the ranges to your actual tables so that the second argument covers all the potential values and their significances.
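
The idea behind these lookup formulas (one reference table, one lookup per entry, blank when nothing matches) can be sketched outside of Sheets as a dictionary lookup. This is only an illustration in Python. The work types and significance values below are hypothetical, and the empty-string fallback plays the role of IFERROR.

```python
# Hypothetical reference table: type of work -> significance value.
SIGNIFICANCE = {
    "Email": "Low",
    "Coding": "High",
    "Meetings": "Medium",
}


def significance_for(work_type):
    """Return the significance for a work type, or "" when there is no match."""
    return SIGNIFICANCE.get(work_type.strip(), "")


# One lookup per entered cell, including a blank cell and an unknown work type.
entries = ["Coding", "Email", "", "Gardening"]
filled = [significance_for(e) for e in entries]
print(filled)
```

Adding a new type of work only means adding one row to the reference table. The lookup itself does not change, which is the advantage over nested IF/IFS formulas.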

This is the formula that worked for me (for anybody's reference):
I created another reference sheet listing the types of work and their significance. From that sheet, I can use either VLOOKUP, FILTER, or XLOOKUP. I am using Google Forms to input my data.
=ARRAYFORMULA(IFS(ROW(D:D)=1,"Significance",A:A="","",TRUE,VLOOKUP(D:D,Reference!$A:$B,2,0)))

Creating new column based on two selection conditions

Subject ID   Condition   Task A   Task B   First Task
1001         1
1002         2
1003         1
This is a within-subjects design. Each participant took part in tasks A and B; however, the order in which the tasks were presented (first or second) depends on the condition (e.g., those in condition 1 perform task A first followed by task B, and vice versa). Note that the task columns do have their own scores, but I cannot add them here.
Is it possible to produce an elegant piece of code that mutates a new column/variable called 'first task', so that for subjects in condition 1 the score from task A is put into this new column, and for subjects in condition 2 the score from task B is put into it (because those in condition 2 received task B first)?
I hope this makes sense. I am trying to combine mutate with case_when, group_by and if/if_else functions to achieve something like this, but have not succeeded.
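
One way to think about the case_when style of solution is as a small table that maps each condition to the column holding the first task. The sketch below shows that idea in plain Python rather than R. The scores and column names are hypothetical, since the question could not include them. An unrecognised condition yields None, much as case_when yields NA when no case matches.

```python
# Which score column was presented first under each condition (from the question).
FIRST_TASK_COLUMN = {1: "task_a", 2: "task_b"}


def first_task_score(row):
    """Look up the first-task score; None if the condition is not recognised."""
    column = FIRST_TASK_COLUMN.get(row["condition"])
    return row[column] if column is not None else None


# Hypothetical scores for the three subjects shown in the table.
subjects = [
    {"subject_id": 1001, "condition": 1, "task_a": 20, "task_b": 25},
    {"subject_id": 1002, "condition": 2, "task_a": 18, "task_b": 22},
    {"subject_id": 1003, "condition": 1, "task_a": 30, "task_b": 27},
]
for s in subjects:
    s["first_task"] = first_task_score(s)
    print(s["subject_id"], s["first_task"])
```

No group_by step is needed for this: the choice depends only on values in the same row.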

What is the moment when the sales order is fully formed with its lines and the entire posting process is finished?

I would like to know at what point in Dynamics AX 2009 a sales order (SalesTable) is fully formed with all its lines (SalesLine). I need to run a code snippet right after the order (and its lines) has been created. What I am looking for is effectively like an "after insert" trigger in SQL. The piece of code I want to run will go through all the lines of the order in sequence and perform an action.

That's sort of impossible, because a SO can be reopened or if you have an open SO and your user has added 1 line...well who knows if the user is done adding lines? How do you know when it's "complete"? You could go off of the point of confirmation, but not everyone does confirmations. You need to determine some "point" in business that you consider "complete" and then go from there. You could trigger an event after each SO line is created if your code is written to be idempotent?
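
To illustrate what "idempotent" means here: if the action is triggered after every line insert, running it again for a line it has already handled must change nothing. The sketch below is plain Python, not X++ and not an AX API. The function name, the in-memory set, and the order number are all hypothetical stand-ins for whatever persisted marker a real implementation would use.

```python
processed_line_keys = set()  # stands in for a persisted "already handled" marker
actions_taken = []           # records each time the real action actually runs


def on_sales_line_created(sales_id, line_num):
    """Run the per-line action once; repeated calls for the same line do nothing."""
    key = (sales_id, line_num)
    if key in processed_line_keys:
        return False  # already handled, so a re-trigger is harmless
    processed_line_keys.add(key)
    actions_taken.append(key)
    return True


# The second call repeats line 1 and is ignored.
results = [
    on_sales_line_created("SO-0001", 1),
    on_sales_line_created("SO-0001", 1),
    on_sales_line_created("SO-0001", 2),
]
print(results, actions_taken)
```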

How can I resolve INDEX MATCH errors caused by discrepancies in the spelling of names across multiple data sources?

I've set up a Google Sheets workbook that synthesizes data from a few different sources via manual input, IMPORTHTML and IMPORTRANGE. Once the data is populated, I'm using INDEX MATCH to filter and compare the information and to RANK each data set.
Since I have multiple data inputs, I'm running into a persistent issue of names not being written exactly the same between sources, even though they're the same person. First names are the primary culprit (i.e. Mary Lou vs Marylou vs Mary-Lou vs Mary Louise) but some last names with special symbols (umlauts, accents, tildes) are also causing errors. When Sheets can't recognize a match, the INDEX MATCH and RANK functions both break down.
I'm wondering how to better unify the data automatically so my Sheet understands that each occurrence is actually the same person (or "value").
Since you can't edit the results of an IMPORTHTML directly, I've set up "helper columns" and used functions like TRIM and SPLIT to try and fix instances as I go, but it seems like there must be a simpler path.
It feels like IFS could work but I can't figure how to integrate it. Also thinking this may require a script, which I'm just beginning to study.
Here's a simplified example of what I'm trying to achieve and the corresponding errors: Sample Spreadsheet
The first tab is attempting to pull and RANK data from tabs 2 and 3. Sample formulas from the Summary tab, row 3 (Amelia Rose):
Cell B3: =INDEX('Q1 Sales'!B:B, MATCH(A3,'Q1 Sales'!A:A,0))
Cell C3: =RANK(B3,$B$2:B,1)
Cell D3: =INDEX('Q2 Sales'!B:B, MATCH(A3,'Q2 Sales'!A:A,0))
Cell E3: =RANK(D3,$D$2:D,1)
I'd be grateful for any insight on how best to index 'Q2 Sales'!B3 as the correct value for 'Summary'!D3. Thanks in advance - the thoughtful answers on Stack Overflow have gotten me this far!

To counter every possible scenario, do it like this:
=ARRAYFORMULA(IFERROR(VLOOKUP(LOWER(REGEXREPLACE(A2:A, "-|\s", )),
{REGEXEXTRACT(LOWER(REGEXREPLACE('Q2 Sales'!A2:A, "-|\s", )),
TEXTJOIN("|", 1, LOWER(REGEXREPLACE(A2:A, "-|\s", )))), 'Q2 Sales'!B2:B}, 2, 0)))
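
The formula above normalizes both sides (lowercase, hyphens and whitespace removed) and then matches a source name if it contains a looked-up name. The same idea can be sketched in Python. Note that the accent stripping is an addition not present in the formula, the names and figures are hypothetical, and substring matching can produce false positives on short names, so treat this as a rough sketch rather than a definitive fix.

```python
import re
import unicodedata


def normalize_name(name):
    """Lowercase, strip accents, and drop hyphens and whitespace."""
    # NFKD splits an accented letter into its base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[-\s]", "", without_marks.lower())


def find_sales(name, sales):
    """Return the value of the first source name that contains `name` after normalizing both."""
    wanted = normalize_name(name)
    for source_name, value in sales.items():
        if wanted and wanted in normalize_name(source_name):
            return value
    return None  # no match, comparable to the blank that IFERROR leaves


# Hypothetical source data with inconsistent spellings.
q2_sales = {"Mary-Lou Smith": 120, "José Núñez": 95, "Amelia  Rose": 80}
for summary_name in ["Marylou Smith", "Jose Nunez", "Amelia Rose", "Unknown Person"]:
    print(summary_name, find_sales(summary_name, q2_sales))
```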

DAX code for count and distinct count measures (or calculated columns)

I hope somebody can help me with some hints for the following analysis. Students may perform actions on courses (enroll, join, grant, ...) and also the reverse: cancel the latest action.
The first metric is a count of all the actions that occurred in the system between two dates; the dates are exposed as a filter/slicer.
Some sample data:
person-id, person-name, course-name, event, event-rank, startDT, stopDT
11, John, CS101, enrol, 1, 2000-01-01, 2000-03-31
11, John, CS101, grant, 2, 2000-04-01, 2000-04-30
11, John, CS101, cancel, 3, 2000-04-01, 2000-04-30
11, John, PHIL, enrol, 1, 2000-02-01, 2000-03-31
11, John, PHIL, grant, 2, 2000-04-01, 2000-04-30
The data set (ds) is above and I have added the following code for the count metric:
evaluate
{
    -- +1 for every non-cancel event, -1 for every cancel event
    sumx(
        addcolumns(
            ds,
            "z+", if([event] <> "cancel", 1, 0),
            "z-", if([event] = "cancel", -1, 0)
        ),
        [z+] + [z-]
    )
}
The metric should display 3 subscriptions (John-CS101 = 1, John-PHIL = 2).
There are some other rules that I don't know how to add to the DAX code: the cancel dates are the same as those of the preceding (non-cancel) action, and the rank of the cancel action equals the rank of the non-cancel action + 1.
I also need the number of distinct student/course combinations (the composite key). How can I add this to the code, please (via summarize, rankx)?
Regards,
Q
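
As a cross-check of the expected numbers, the counting rule (+1 per non-cancel event, -1 per cancel) and the distinct student/course count can be sketched in Python against the sample data. This is not DAX. Filtering the date window on startDT is an assumption, since the question does not say which date column the slicer applies to.

```python
# (person_id, person_name, course, event, rank, start, stop), copied from the sample data.
events = [
    (11, "John", "CS101", "enrol", 1, "2000-01-01", "2000-03-31"),
    (11, "John", "CS101", "grant", 2, "2000-04-01", "2000-04-30"),
    (11, "John", "CS101", "cancel", 3, "2000-04-01", "2000-04-30"),
    (11, "John", "PHIL", "enrol", 1, "2000-02-01", "2000-03-31"),
    (11, "John", "PHIL", "grant", 2, "2000-04-01", "2000-04-30"),
]


def net_actions(rows, date_from, date_to):
    """+1 per non-cancel event, -1 per cancel, for events starting inside the window."""
    total = 0
    for _pid, _name, _course, event, _rank, start, _stop in rows:
        if date_from <= start <= date_to:  # ISO dates compare correctly as strings
            total += -1 if event == "cancel" else 1
    return total


def distinct_student_courses(rows):
    """Count distinct (person_id, course) pairs, i.e. the composite key."""
    return len({(row[0], row[2]) for row in rows})


net_total = net_actions(events, "2000-01-01", "2000-12-31")
pair_count = distinct_student_courses(events)
print(net_total, pair_count)
```

For the sample rows this gives a net count of 3, matching the expected metric, and 2 distinct student/course pairs.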

This isn't technically an answer, but more of a recommendation.
It sounds like your challenge is that you have actions that may then be cancelled. There is specific logic that determines whether an action is cancelled or not (i.e. the cancellation has to be the immediate next row and the dates must match).
What I would recommend, which doesn't answer your specific question, is to adjust your data model rather than put the cancellation logic in DAX.
For example, if you could add a column to your data model that flags a row as subsequently cancelled, then all DAX has to do is check that flag to know whether an action is cancelled, for instance with a simple CALCULATE statement. You don't need lots of logic to determine whether the event was cancelled, and you entirely eliminate the need for SUMX, which can be slow when working with a lot of rows since it works row by row.
The logic for whether an action is cancelled or not moves to your source system (e.g. SQL or even a calculated column in Excel), or to your ETL (e.g. the Query Editor in Power BI) which are better equipped for such tasks. The logic is applied 1 time and then exists in your data model for all measures, instead of needing to apply the logic each time a measure is used.
I know this doesn't help you solve your logic question, but the reason I make this recommendation is that DAX is fundamentally a giant calculator. It adds things up. It's great at filters (adding some things up but not others), but it works best when everything is reduced to columns that it can sum or count. Once you go beyond that (e.g. wanting to look at the row below to adjust something about the current row), your DAX is going to get very complicated (and slow), whereas a source system or the Query Editor will likely be able to handle such requirements more easily.
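
A rough sketch of that flagging step, written in Python as a stand-in for whatever the source system or ETL tool would do. It applies the two rules stated in the question (same dates, cancel rank = action rank + 1) to the question's sample rows. The field names and the function name are hypothetical.

```python
FIELDS = ("person_id", "course", "event", "rank", "start", "stop")
raw = [
    (11, "CS101", "enrol", 1, "2000-01-01", "2000-03-31"),
    (11, "CS101", "grant", 2, "2000-04-01", "2000-04-30"),
    (11, "CS101", "cancel", 3, "2000-04-01", "2000-04-30"),
    (11, "PHIL", "enrol", 1, "2000-02-01", "2000-03-31"),
    (11, "PHIL", "grant", 2, "2000-04-01", "2000-04-30"),
]
rows = [dict(zip(FIELDS, r)) for r in raw]


def flag_cancelled(rows):
    """Add a 'cancelled' flag to each non-cancel row that a cancel row undoes."""
    # A cancel undoes the action of the same person and course whose rank is
    # one lower and whose start/stop dates match.
    cancels = {
        (r["person_id"], r["course"], r["rank"] - 1, r["start"], r["stop"])
        for r in rows
        if r["event"] == "cancel"
    }
    flagged = []
    for r in rows:
        key = (r["person_id"], r["course"], r["rank"], r["start"], r["stop"])
        flagged.append({**r, "cancelled": r["event"] != "cancel" and key in cancels})
    return flagged


flagged = flag_cancelled(rows)
# With the flag in place, the measure reduces to a filtered count.
active = sum(1 for r in flagged if r["event"] != "cancel" and not r["cancelled"])
print(active)
```

Once the flag exists as a column, the measure only has to count rows where the event is not a cancel and the flag is false, which is the kind of simple filter the answer argues DAX handles well.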
