Check match between two sheets in Google Sheets - if-statement

I'm trying to compare values from one sheet to another and show the result of the comparison in a third one. Let me share some values:
Data sheet 1:
Institution name | Management | Radiology | Nuclear number | Pathology | Laboratorie | Systemic | Pediatric | Radiotherapy | Surgery | Palliative
Institute 1 | Log in | Log in | Not Completed | Log in | Log in | Log in | Log in | Completed | Log in
Institute 2 | Not Completed | Completed | Not Completed | Not Completed | Log in | Log in | Log in | Not Completed | Completed | Completed
Institute 3 | Not Completed | Not Completed | Log in | Completed | Completed | Log in | Log in | Completed
Data sheet 2:
Institution name | Management | Radiology | Nuclear number | Pathology | Laboratorie | Systemic | Pediatric | Radiotherapy | Surgery | Palliative
Institute 1 | Not Completed | Not Completed | Completed | Not Completed | Not Completed | Not Completed | Not Completed | Completed | Not Completed
Institute 2 | Not Completed | Completed | Completed | Not Completed | Not Completed | Not Completed | Not Completed | Completed | Not Completed | Not Completed
Institute 3 | Not Completed | Not Completed | Not Completed | Not Completed | Not Completed | Not Completed | Not Completed | Not Completed
As you can see, some cells already match on "Completed", but for the ones that do not match I want the following logic:
If Sheet 2 shows "Completed" but Sheet 1 does not, the result in the third sheet should be "Completed".
If the cell in Sheet 2 does not show "Completed", the result should be the text from the corresponding cell in Sheet 1.
I have my Google Sheets document with the full data and the desired result in the sheet "doublecheck". Help please

I think you are looking for a formula like this:
=ARRAYFORMULA(
  IF(
    (((Track!B2:K20="Completed")*(submission!B2:K20="Completed"))=1)+(submission!B2:K20="Completed")>0;
    "Completed";
    Track!B2:K20))
For clarification on the logical expression please refer to this answer.
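Incidentally, the first product term is redundant: whenever both sheets show "Completed", the second condition is already true on its own. If that reading of the requirement is right, an equivalent, shorter form (same sheet names and ranges assumed) would be:
=ARRAYFORMULA(
  IF(submission!B2:K20="Completed";
    "Completed";
    Track!B2:K20))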
Link to Google Sheet

Related

User login and logout in DynamoDB

The requirement is to store employees' login and logout information in DynamoDB.
If an employee works for more than 4 hours a day (either at a stretch or with breaks in between), I need to notify the employee to log off for that day and take a break. I need to poll the DB every 10 minutes to check whether the 4 hours are completed for the day, and I don't want to use the scan approach. This is how my DynamoDB table looks. How can I achieve it? Any suggestions are helpful.
Employee Number | Event DateTime | Action | Worked hours
abc | 24/6/2022 9AM | Login | 0
abc | 24/6/2022 12PM | Logout | 3
abc | 24/6/2022 2PM | Login | 3
I suggest you set empId as the partition key and create a global secondary index including status and status update date:
Primary Index:
Partition key: empId
Global Secondary Index:
Partition key: status ('loggedOut' or 'loggedIn')
Sort Key: statusUpdateDate (epochTime in seconds)
Run a scheduled rule in EventBridge to trigger a Lambda every 5 minutes.
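If it helps, a sketch of that schedule with the AWS JavaScript SDK, to be run inside an async setup routine (the rule name and Lambda ARN are placeholders, and the Lambda also needs a resource-based permission allowing events.amazonaws.com to invoke it):
const AWS = require("aws-sdk");
const eventBridge = new AWS.EventBridge();

// Fire the checker Lambda every 5 minutes
await eventBridge.putRule({
  Name: "check-logged-in-employees",
  ScheduleExpression: "rate(5 minutes)",
}).promise();
await eventBridge.putTargets({
  Rule: "check-logged-in-employees",
  Targets: [{ Id: "checker-lambda", Arn: "<lambda_arn>" }],
}).promise();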
In the Lambda function, run the following query to find the employees that have been logged in for more than 4 hours:
const ddbClient = new AWS.DynamoDB.DocumentClient();

ddbClient.query({
  TableName: "tableName",
  IndexName: "gsiName",
  // "status" is a DynamoDB reserved word, so it must go through ExpressionAttributeNames
  KeyConditionExpression: "#status = :loggedIn and statusUpdateDate < :currentTimeMinus4Hours",
  ExpressionAttributeNames: { "#status": "status" },
  ExpressionAttributeValues: {
    ":loggedIn": "loggedIn",
    ":currentTimeMinus4Hours": moment().utc().unix() - 14400, // now - 4 hours
  },
})
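For this query to work, each login/logout write must keep the GSI attributes current. A minimal sketch of that write path (same assumed table and attribute names as above, using the DocumentClient):
// Refresh status and statusUpdateDate whenever an event is recorded
function recordEvent(empId, action) {
  return ddbClient.update({
    TableName: "tableName",
    Key: { empId: empId },
    UpdateExpression: "SET #status = :status, statusUpdateDate = :now",
    ExpressionAttributeNames: { "#status": "status" }, // reserved word, as in the query
    ExpressionAttributeValues: {
      ":status": action === "Login" ? "loggedIn" : "loggedOut",
      ":now": moment().utc().unix(), // epoch seconds, matching the GSI sort key
    },
  }).promise();
}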

COGNOS report poor performance when using repeater

I have a table with 3 columns: CLIENT_ID, STORE_ID and MADE_PURCHASE. Basically I'm trying to get a list of CLIENT_ID and an array of STORE_ID where a customer made a purchase. For the following data, here is the expected result:
DATA:
CLIENT_ID | STORE_ID | MADE_PURCHASE
1 | a | YES
1 | b | YES
1 | c | YES
2 | a | YES
2 | b | NO
2 | c | YES
3 | a | NO
3 | b | NO
3 | c | NO
Expected result:
CLIENT_ID | STORE_ID
1 | a,b,c
2 | a,c
I was able to achieve the desired result by creating a query to filter out lines where MADE_PURCHASE = 'NO'. Then I created a list in the report: the first column is CLIENT_ID, and in the second column I inserted a repeater that contains STORE_ID.
The problem is that the repeater slows my report by a factor about equal to the number of CLIENT_ID retrieved. For example if I run the query without a repeater and it returns 10 unique CLIENT_ID in 10 seconds, then adding the repeater slows the report to 100 seconds. As soon as I enter more than a few hundred CLIENT_ID in the prompt the report takes multiple hours to run.
I tried editing the master-detail relationship between the list and the repeater without much change. Does anyone have any idea how I could make it run faster?
P.S. I know the desired output format is not ideal, but the goal is to mimic a legacy report that was built in Excel using CONCATENATE on STORE_ID; as such, the client wants to keep the original format.
You can try editing the Framework Manager governors: set the "(DQM) Master-Detail Optimization" parameter to "Cache relational detail query".

How to count active customers using start and end date in Power-BI?

Newbie here to Power-BI. I have a project where I need to count the number of active customers per month based on the Start_date and exit_date. Some customers do not yet have an exit date, as they are still active / haven't yet exited.
If all you want to do is count the customers that have no exit date (active customers), this worked for me:
CountActive = COUNTBLANK('your table'[EXIT_DATE])
If you want to show the active customers based on the start date you can simply use a slicer with START_DATE. The measure will automatically update the count of active customers.
If you want to show the customers that have already exited (non-active customers) you could do the following:
CountNonBlank = COUNTROWS('your table') - COUNTBLANK('your table'[EXIT_DATE])
Again, just simply add a slicer to show count depending on start date.
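If you need a true month-by-month count (a customer is active in a month if they started on or before the month's end and either have no exit date or exited on or after the month's start), one common pattern is a measure evaluated against a date table. A sketch, assuming a hypothetical disconnected 'Calendar'[Date] table used as the month axis:
CountActivePerMonth =
VAR MonthStart = MIN ( 'Calendar'[Date] )
VAR MonthEnd = MAX ( 'Calendar'[Date] )
RETURN
    CALCULATE (
        COUNTROWS ( 'your table' ),
        FILTER (
            'your table',
            // started before the month ended, and not exited before it started
            'your table'[START_DATE] <= MonthEnd
                && ( ISBLANK ( 'your table'[EXIT_DATE] )
                    || 'your table'[EXIT_DATE] >= MonthStart )
        )
    )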
Hope this helps

Getting Info from GCP Data Catalog

I notice when you query the Data Catalog in Google Cloud Platform it retrieves stats for the number of times a table has been queried:
Queried (Past 30 days): 5332
This is extremely useful information and I was wondering where it is actually stored, and whether it can be retrieved for all the tables in a project or a dataset.
I have trawled the Data Catalog tutorials and written some Python scripts, but these just return table entry names in an iterator, which is not what I am looking for.
Likewise, I cannot see this data in the INFORMATION_SCHEMA metadata.
You can retrieve the number of completed queries of any table/dataset by exporting log entries to BigQuery. Every query generates some logging on Stackdriver, so you can use advanced filters to select the logs you are interested in and store them as a new table in BigQuery.
However, the retention period for the data access logs in GCP is 30 days, so you can only export the logs from the past 30 days.
For instance, use the following advanced filter to get the logs corresponding to all the completed jobs of a specific table:
resource.type="bigquery_resource" AND
log_name="projects/<project_name>/logs/cloudaudit.googleapis.com%2Fdata_access" AND
proto_payload.method_name="jobservice.jobcompleted"
"<table_name>"
Then select BigQuery as the sink service, and state a name for your sink and the dataset where the exported logs will be stored.
All the completed jobs on this table performed after the sink is established will appear as a new table in BigQuery. You can query this table to get information about the logs (for instance, you can use a COUNT statement on any column to get the total number of successful jobs).
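As an illustration, a short Python sketch of that count using the google-cloud-bigquery client (the dataset and sink table names below are placeholders, like the ones in the filter above):
from google.cloud import bigquery

client = bigquery.Client()

# Count the completed-job log entries collected by the sink.
# `<project_name>.<dataset>.<sink_table>` is a placeholder for your sink's destination.
query = """
    SELECT COUNT(*) AS completed_jobs
    FROM `<project_name>.<dataset>.<sink_table>`
"""
for row in client.query(query).result():
    print(row.completed_jobs)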
This information is also available via the projects.locations.entryGroups.entries/get API. It is exposed as a UsageSignal, and contains usage information for the past 24 hours, 7 days, and 30 days.
Sample output:
"usageSignal": {
"updateTime": "2021-05-23T06:59:59.971Z",
"usageWithinTimeRange": {
"30D": {
"totalCompletions": 156890,
"totalFailures": 3,
"totalCancellations": 1,
"totalExecutionTimeForCompletionsMillis": 6.973312e+08
},
"7D": {
"totalCompletions": 44318,
"totalFailures": 1,
"totalExecutionTimeForCompletionsMillis": 2.0592365e+08
},
"24H": {
"totalCompletions": 6302,
"totalExecutionTimeForCompletionsMillis": 25763162
}
}
}
Reference:
https://cloud.google.com/data-catalog/docs/reference/rest/v1/projects.locations.entryGroups.entries/get
https://cloud.google.com/data-catalog/docs/reference/rest/v1/projects.locations.entryGroups.entries#UsageSignal
With the Python Data Catalog client, you first need to search the Data Catalog; each result carries a linked_resource.
Pass this linked_resource in a lookup_entry request and you can fetch the queried (past 30 days) count (the client setup and search query below are one possible arrangement):
from google.cloud import datacatalog_v1

dc_client = datacatalog_v1.DataCatalogClient()

# Search for BigQuery tables in the project (the project id is a placeholder)
scope = datacatalog_v1.SearchCatalogRequest.Scope(include_project_ids=["<project_id>"])
request = datacatalog_v1.SearchCatalogRequest(scope=scope, query="system=bigquery type=table")

results = dc_client.search_catalog(request=request, timeout=120.0)
for result in results:
    linked_resource = result.linked_resource
    # Get the entry and the number of times the table was queried in the last 30 days
    table_entry = dc_client.lookup_entry(request={"linked_resource": linked_resource})
    queried_past_30_days = table_entry.usage_signal.usage_within_time_range.get("30D")
    if queried_past_30_days is not None:
        dc_num_queried_past_30_days = int(queried_past_30_days.total_completions)
    else:
        dc_num_queried_past_30_days = 0

cflock/cfthread for auction closures

I'm building an auction site and I want to make sure that there are no issues with multiple auctions closing at similar times. I have a basic understanding of cflock and was wondering how it, or alternatives like cfthread, could be applied optimally to protect this process from race conditions.
So far, my processing goes like this:
When an auction reaches its closing time, I initially set a status of "1" to indicate that the auction is closing. Site users then see the message "Closing" while the closure process takes place.
close_auction.cfm
<cfset resUpdateStatus = oListing.updateStatus(listing_id=url.listing_id, status=1)>
Then the call to the closeAuction() function:
<cfset resCloseAuction = oListing.closeAuction(listing_id=url.listing_id)>
listing.cfc
The closeAuction() function, with a cflock I'm currently using:
<cflock timeout="30" name="closeAuction_#arguments.listing_id#" type="exclusive">
  <cftransaction>
    <!--- if for whatever reason auction closure time not yet reached, delay a few seconds --->
    <cfloop condition="now() lt qListing.end_date">
      <cfset oSystem.pause(TimeDelay=5)>
    </cfloop>
    <!--- processing here --->
    - select on listing table to ensure listing still exists
    - select on bids table - check high bid, determine if winner
    - update listing table - status, winner_id, close_date etc
    - update user_subscription table
    - delete scheduled task that auto-closes the auction
    - update tracking table
    - delete listing from users' watchlists
  </cftransaction>
</cflock>
This works fine while testing single auctions, but I'm concerned about what could happen under heavy load with many auctions closing potentially within seconds of each other. Any advice on bullet-proofing this would be appreciated.
I'm on CF12
UPDATE
Updated to show processing inside cftransaction