Subject ID   Condition   Task A   Task B   First Task
1001         1
1002         2
1003         1
This is a within-subjects (repeated-measures) design: each participant took part in both task A and task B, but the order in which the tasks were presented depends on condition. Those in condition 1 performed task A first, followed by task B, and vice versa for condition 2. (The task columns each hold their own score, which I cannot include here.)

I would like to mutate a new column called 'First Task' that holds the score from whichever task occurred first for that subject. For example, participant 1001 was in condition 1, so their score from task A should go into this new column; participant 1002 was in condition 2, so their score from task B should go there instead, and so on.

After scouring possible threads (which have always solved every need I have!), I have tried combining mutate with case_when, group_by and if/if_else, but have not succeeded. It is an elegant piece of code like this I am after, as opposed to re-creating a new dataframe. I would greatly appreciate any thoughts or ideas; let me know if anything is unclear (I have simplified the table above as an example to make the question clearer).

Many thanks
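For illustration, here is the row-wise selection I am after, sketched in plain Python (the scores are invented, since they are omitted from the table above):

```python
# Each row: subject id, condition, and (hypothetical) task scores.
rows = [
    {"subject": 1001, "condition": 1, "task_a": 10, "task_b": 15},
    {"subject": 1002, "condition": 2, "task_a": 20, "task_b": 25},
    {"subject": 1003, "condition": 1, "task_a": 30, "task_b": 35},
]

# Condition 1 did task A first, condition 2 did task B first,
# so 'first_task' copies the score of whichever task came first.
for row in rows:
    row["first_task"] = row["task_a"] if row["condition"] == 1 else row["task_b"]

first_task_scores = [row["first_task"] for row in rows]
print(first_task_scores)  # [10, 25, 30]
```

In dplyr terms this corresponds to something like `mutate(first_task = if_else(condition == 1, task_a, task_b))`; no `group_by` should be needed.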
An example: a table with a column counter and only 1 row. I want to update this row's counter value based on its current value.

Say if counter is >= 10, set it to 0, otherwise increment it. How can I achieve this if/else logic in TransactWrite?

I can't have two actions in one transaction, because the documentation states that it does not allow more than one action on the same item.

And of course, the reason I use TransactWrite is that there will be multiple Lambda functions doing this task in parallel.
You cannot do it in one request, transactional or otherwise.
You can get the item, decide what to do, and then update the item in a second request accordingly. You’ll want to keep a version number or timestamp attribute to make sure the item hasn’t changed between the read and write, and use a condition expression to fail if it has.
That’s a common idiom:
https://dynobase.dev/dynamodb-locking/
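A minimal sketch of that read-then-conditional-write idiom, using an in-memory stand-in for the table (all names here are invented; in real DynamoDB, `conditional_put` below would be an UpdateItem with a ConditionExpression such as `version = :expected`, and the failure branch would be a ConditionalCheckFailedException):

```python
import threading

class VersionedItem:
    """In-memory stand-in for a DynamoDB item with a version attribute."""
    def __init__(self, counter=0):
        self.counter = counter
        self.version = 0
        self._lock = threading.Lock()  # stands in for DynamoDB's per-write atomicity

    def get(self):
        return self.counter, self.version

    def conditional_put(self, new_counter, expected_version):
        # Mirrors a write with ConditionExpression "version = :expected".
        with self._lock:
            if self.version != expected_version:
                return False  # the item changed since we read it
            self.counter = new_counter
            self.version += 1
            return True

def bump(item):
    # Read, decide in application code, write back conditionally;
    # retry if another writer won the race in between.
    while True:
        counter, version = item.get()
        new_counter = 0 if counter >= 10 else counter + 1
        if item.conditional_put(new_counter, version):
            return new_counter
```

Usage: `bump(VersionedItem(counter=9))` returns 10, while `bump(VersionedItem(counter=10))` wraps to 0.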
You can't do it the way you'd like, because if/else is not supported in transactions. There is, however, a simpler solution.

Just update the item and increment the counter unconditionally. Whenever you read the value, take the counter value modulo 10, and you get the desired behavior, e.g. 123 % 10 = 3 or 10 % 10 = 0.
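A sketch of that approach (table and attribute names are assumed for the example): the write side is a single unconditional increment, and the wrap-around happens only at read time:

```python
def effective_counter(raw_counter: int, limit: int = 10) -> int:
    # The stored counter only ever grows, so concurrent writers never
    # need a condition; readers derive the wrapped value on the fly.
    return raw_counter % limit

# The write side would be one unconditional update, e.g. with boto3:
#   table.update_item(
#       Key={"pk": "the-counter"},
#       UpdateExpression="ADD #c :one",
#       ExpressionAttributeNames={"#c": "counter"},
#       ExpressionAttributeValues={":one": 1},
#   )

print(effective_counter(123), effective_counter(10))  # 3 0
```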
I am creating a Step Function whose input is an array, e.g. {"ids": [1, 2, 3]}. I have 2 Glue jobs that I want to execute for these ids: Glue job 1 would run with id 1, Glue job 2 with id 2, and then Glue job 1 would run with id 3 once it has finished processing id 1. I have tried using a Parallel state, but it passes the complete ids list to each branch rather than a chunk. I have also considered the Map state, but a Map state runs only one task in parallel over the list, while in my case I have 2 Glue jobs.

What could be the resolution for this? Please suggest a solution using Step Functions.
What if you split your ids into two arrays first (as your first step)? So convert

{"ids": [1, 2, 3, 4, 5]}

to

{
  "ids1": [1, 3, 5],
  "ids2": [2, 4]
}

Then add a Parallel state with two branches, each containing a Map state: one iterates over ids1 and sends each id to Glue job 1, and the other iterates over ids2 and sends each id to Glue job 2.
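The splitting step could be a small Lambda; a sketch of its handler logic in Python (the event shape is taken from the question, and the round-robin split is one possible choice):

```python
def split_ids(event):
    # Round-robin partition: alternate ids between the two Glue jobs,
    # so the two lists stay balanced even for odd-length input.
    ids = event["ids"]
    return {"ids1": ids[0::2], "ids2": ids[1::2]}

print(split_ids({"ids": [1, 2, 3, 4, 5]}))  # {'ids1': [1, 3, 5], 'ids2': [2, 4]}
```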
Update 1:
If you don't want one Glue job to finish sooner and sit idle, then instead of two arrays you can keep a single list but add a status to each row:

{
  "id": 1,
  "status": null | "job1" | "job2"
}

Then, instead of a Map state for each job, create a loop for each job: first pick an item from the list, then call the Glue job.

So your Select_an_id state will choose one id from that list and change the status for that record. You need to create a Lambda task state to do this.
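A sketch of what that Select_an_id Lambda might do, run here over an in-memory list for illustration (in the real setup the list would live in a store such as DynamoDB, and the claim would need to be an atomic conditional update so that two jobs cannot claim the same id):

```python
def select_an_id(items, job_name):
    # Claim the first id that no job has picked up yet.
    for item in items:
        if item["status"] is None:
            item["status"] = job_name  # mark it as taken by this job
            return item["id"]
    return None  # nothing left to process; the job's loop can end here

items = [{"id": i, "status": None} for i in (1, 2, 3)]
print(select_an_id(items, "job1"), select_an_id(items, "job2"))  # 1 2
```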
I am having an issue with MS Access 2000: when I enter the same field value in a query multiple times, it is only displayed once. For example, if I enter the value 8150 multiple times, the query displays it only once.

This image shows the query.

I've already checked everything I could find in MS Access 2000 to try to resolve this issue, but I've come up with nothing suitable.
I know your data set is simplified, but looking at your data, inputs, etc, it appears your query is pulling from a single table and repeating results -- so there is no join consideration.
I think the issue is your DISTINCTROW in the query, which is removing all duplicate values.
If you remove the "DISTINCTROW," I believe it may give you what you are expecting. In other words, change this:
SELECT DISTINCTROW Ring.[Ring Number], Ring.[Mounting Weight]
FROM Ring
To this:
SELECT Ring.[Ring Number], Ring.[Mounting Weight]
FROM Ring
For what it's worth, there may also be some strategies for simplifying how this query is run in the future (less dependence on dialog box prompts), but I know you probably want to address the issue at hand first, so let me know if this doesn't do it.
-- EDIT --
The removal of distinct still applies, but I suddenly see the problem. The query is depicting the logic as "OR" of multiple values. Therefore, repeating the value does not mean multiple rows, it just means you've repeated a true condition.
For example, if I have:
Fruit Count
------ ------
Apple 1
Pear 1
Kiwi 3
and I say select where Fruit is Apple OR Apple OR Apple OR Apple, the query is still only going to list the Apple row once. A row either matches the WHERE clause or it doesn't; repeating a condition that is already true does not produce any additional rows.
That does not sound like what you want.
Here's what I think you need to do:
Get rid of the prompts within the query
Load your options into a separate table -- the repetition can occur here
Change your query to perform an inner join on the new table
New table (named "Selection" for the sake of example):
Entry   Ring Number   Mounting Weight
-----   -----------   ---------------
1       8105          you get the idea...
2       8110
3       8110
4       8110
5       8115
6       8130
7       8130
8       8130
9       8130
10      8150
New Query:
SELECT Ring.[Ring Number], Ring.[Mounting Weight]
FROM Ring
INNER JOIN Selection ON Ring.[Ring Number] = Selection.[Ring Number]

This has the added advantage of allowing more (or fewer) than 10 records.
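For illustration, the same join can be run against SQLite from Python (column names flattened, sample weights invented); note how 8110 comes back once per Selection row:

```python
import sqlite3

# Repetition lives in the Selection table; the inner join then
# returns one result row per repeated Selection entry.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Ring (ring_number INTEGER, mounting_weight REAL)")
con.execute("CREATE TABLE Selection (entry INTEGER, ring_number INTEGER)")
con.executemany("INSERT INTO Ring VALUES (?, ?)",
                [(8110, 1.2), (8130, 2.5), (8150, 3.1)])
con.executemany("INSERT INTO Selection VALUES (?, ?)",
                [(1, 8110), (2, 8110), (3, 8130), (4, 8150)])
rows = con.execute(
    """SELECT Ring.ring_number, Ring.mounting_weight
       FROM Ring
       INNER JOIN Selection ON Ring.ring_number = Selection.ring_number
       ORDER BY Selection.entry"""
).fetchall()
print(rows)  # 8110 appears twice because Selection lists it twice
```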
I have created an implementation of the Hungarian algorithm in C++. It works very well for many cases; however, there are cases where it does not work at all, because (as I have verified) my implementation of one step of the algorithm is wrong.

My implementation takes the array X as input, runs the steps of the algorithm and produces the final assignment.

The steps of the algorithm can be found on the wiki: Hungarian Algorithm

In step 3 it shows the following array of costs (workers are represented by rows and jobs by columns)
and then it says
Initially assign as many tasks as possible then do the following
However, I don't understand what a correct implementation of this would be. How can you assign as many tasks as possible? Is the choice random? If so, I could choose the first worker to take the first job, the second worker to take the fourth job and the fourth worker to take the second job, so the third worker is left out. On Wikipedia, however, the authors took a different approach: the third worker takes the first job, the second worker takes the second job and the fourth worker takes the second job, so the first worker is left out.
The problem with doing such random actions is the following case:
Suppose while we run the algorithm and doing our arithmetic operations on the input, before assigning as many tasks to workers as possible we have the following cost matrix:
2 2 0 3
6 1 6 0
0 0 6 1
0 3 5 3
If I randomly assign the third job to the first worker, the fourth job to the second worker, and then the first job to the third worker, the fourth worker is left out. But for the algorithm to work correctly we need to assign as many jobs to workers as possible. Is that the case here? No: if, instead of giving the first job to the third worker, I gave it to the fourth worker, I could then assign the second job to the third worker. The algorithm would then not only assign as many jobs as possible but also find the optimal result.
Conclusion: Doing random assignments is not a good approach.
I searched about this a little bit and I found the following lecture:
http://www.youtube.com/watch?v=BUGIhEecipE
In this lecture the professor suggests a different approach for the problem of assigning as many tasks as possible.
According to him, if any row or column has exactly one zero, we make an assignment. So, starting from the first row, you check whether it has exactly one zero; if so, you make an assignment. Otherwise you skip that row, go to the second row and do the same thing, repeatedly rescanning the table until all the zeros are covered by assignments.

By following this approach, the previous case is solved. We assign the third job to the first worker and the fourth job to the second worker; we then see that the third worker can take 2 jobs, so we skip him for a while, assign the first job to the fourth worker, and finally return and assign the second job to the third worker.
My implementation follows this logic; however, again, it does not solve all the cases.
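A sketch of that single-zero scan in Python (my own reconstruction of the lecture's rule, not the original C++ implementation):

```python
def assign_single_zeros(matrix):
    # Repeatedly scan rows, then columns, that contain exactly one
    # still-available zero, and commit that forced assignment.
    n = len(matrix)
    worker_job = {}  # worker index -> job index
    job_worker = {}  # job index -> worker index
    changed = True
    while changed:
        changed = False
        for w in range(n):  # rows with exactly one available zero
            if w not in worker_job:
                zeros = [j for j in range(n)
                         if matrix[w][j] == 0 and j not in job_worker]
                if len(zeros) == 1:
                    worker_job[w], job_worker[zeros[0]] = zeros[0], w
                    changed = True
        for j in range(n):  # columns with exactly one available zero
            if j not in job_worker:
                zeros = [w for w in range(n)
                         if matrix[w][j] == 0 and w not in worker_job]
                if len(zeros) == 1:
                    worker_job[zeros[0]], job_worker[j] = j, zeros[0]
                    changed = True
    return worker_job

costs = [[2, 2, 0, 3],
         [6, 1, 6, 0],
         [0, 0, 6, 1],
         [0, 3, 5, 3]]
print(sorted(assign_single_zeros(costs).items()))  # [(0, 2), (1, 3), (2, 1), (3, 0)]
```

On a matrix where every row and every column has two or more zeros, no assignment is ever forced, so this returns an empty dict and stalls, which is exactly the trouble with the case below.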
Let's take for example the following case:
0 0 0 0
0 0 0 0
0 0 4 9
0 0 2 3
The first worker can take 4 jobs, the second 4, the third 2 and the fourth 2. So my implementation makes no assignments, because it needs at least one worker who can take only one job in order to make an assignment and then continue by rescanning the table.

So what do I do in this case? Arbitrary assignments would be a bad thing to do, and unfortunately nothing is suggested in that lecture.
I could only think of the following:
For each worker, keep a counter whose value is the number of tasks that can be assigned to him, i.e. the number of zeros in that row.

Then start assigning arbitrary tasks to the worker with the smallest counter. In this case, the array of counters would hold the following values:
4
4
2
2
I would choose, for example, the third worker and arbitrarily assign him the first job. The new counters would be:
3
3
0
1
I would then choose the fourth worker and make the only assignment available to him (the second job). The new counters would be:
2
2
0
0
Then I could choose either the first worker or the second. I would make an arbitrary assignment and give the second worker the third job. The counters would be:
1
0
0
0
Finally, I would assign the fourth job to the first worker.

So the final assignments (marked with *):
0 0 0 *
0 0 * 0
* 0 4 9
0 * 2 3
It seems like a good approach; however, I'm afraid there might be a special case in which this method does not work. How can I verify whether this approach works for all cases, and if it doesn't, what approach would solve my problem completely?

Thank you in advance
Your current approach does not work. Consider this matrix:
0 2 0
3 0 0
4 0 0
Your method: "Then start assigning arbitrary tasks to the worker with the smallest counter." All workers have the same counter here, so say you pick worker 1 and assign him task 3; you can then match only one of the remaining workers, while with this matrix you could obviously match all three.

What you need is a maximum bipartite matching between the workers and tasks, where a worker and a task are matchable if there is a 0 in the relevant position. Such a matching can be found by manually running augmenting paths, or more quickly by using the Hopcroft–Karp algorithm.
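A compact sketch of the augmenting-path idea in Python (Kuhn's algorithm; Hopcroft–Karp is a faster refinement of the same principle):

```python
def max_zero_matching(matrix):
    # Maximum bipartite matching: workers (rows) vs jobs (columns),
    # with an edge wherever the cost entry is 0.
    n = len(matrix)
    worker_of_job = [-1] * n  # which worker currently holds each job

    def try_augment(w, visited):
        # Try to give worker w a job, displacing current holders along
        # an augmenting path if necessary.
        for j in range(n):
            if matrix[w][j] == 0 and j not in visited:
                visited.add(j)
                if worker_of_job[j] == -1 or try_augment(worker_of_job[j], visited):
                    worker_of_job[j] = w
                    return True
        return False

    return sum(try_augment(w, set()) for w in range(n))

counterexample = [[0, 2, 0],
                  [3, 0, 0],
                  [4, 0, 0]]
print(max_zero_matching(counterexample))  # 3: all three workers matched
```

On the all-zeros-in-two-rows example from the question, this finds a matching of size 4 where the single-zero scan makes no assignments at all.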