Suppress data which appears in different groups, and keep the one with the latest date - grouping

My report needs to group and count a set of data. When a record appears in different groups with the same ID and TYPE but a different DATE and DECISION, I need to suppress the records that do not have the latest date, and the total count should not include the suppressed records. Can you please help me with this?
Raw data
ID TYPE DATE DECISION
1111 F 12/01/2016 Approved
1122 E 3/02/2016 Approved
1111 F 23/01/2016 Refused
1133 G 3/07/2016 Refused
Before grouping, I am able to suppress the first record which is not with the latest date:
ID TYPE DATE DECISION
1122 E 3/02/2016 Approved
1111 F 23/01/2016 Refused
1133 G 3/07/2016 Refused
After I group the data by DECISION:
Group 1 - Approved
ID TYPE DATE DECISION
1111 F 12/01/2016 Approved
1122 E 3/02/2016 Approved
Group 2 - Refused
ID TYPE DATE DECISION
1111 F 23/01/2016 Refused
1133 G 3/07/2016 Refused
Total Count of ID: 4
Expected Result:
Group 1 - Approved
ID TYPE DATE
1122 E 3/02/2016
Group 2 - Refused
ID TYPE DATE
1111 F 23/01/2016
1133 G 3/07/2016
Total Count of ID: 3

There are a few options. If you do not need Crystal to retrieve all of the records, the best option (from a performance standpoint) is to use a custom SQL command; see the post "Suppress Nonadjacent Duplicates in Report".
If that is not an option, or you need Crystal to have all of the records (for a running total of ALL requests made in the system), you can use conditional suppression, but you won't be able to accomplish it while retaining your current grouping.
See the post "Crystal Reports group sorting" for configuring a conditional detail suppression.
The conditional detail suppression works by adding a number to each record within a group. You can determine which record is number "1" by using Record Sort expert, and then suppressing the details if the record number is greater than 1.
This approach won't work if you have the report grouped by decision first, because the ID is essentially a subset of that decision (thus, 1111 would appear in both decision groups).
If the objective of this report is to get aggregate data, this approach will be fine, because you can create running totals which count records when a record is "Approved" or "Refused", even without any grouping on Decision.
EDIT: This running total will count ALL of the records (ID 1111 would be counted twice). The SQL command is the cleanest way to get you what you need... Another option might be to use a variable. I will research.
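Whichever tool does the work, the logic is the same: pick the latest record per ID and TYPE first, then group and count what is left. The Python below is only a sketch of that order of operations on the sample data from the question, not Crystal syntax; it assumes the dates are day/month/year, and the names `latest`, `groups` and `total` are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Raw data from the question: (ID, TYPE, DATE as dd/mm/yyyy, DECISION).
rows = [
    ("1111", "F", "12/01/2016", "Approved"),
    ("1122", "E", "3/02/2016", "Approved"),
    ("1111", "F", "23/01/2016", "Refused"),
    ("1133", "G", "3/07/2016", "Refused"),
]


def parse(text):
    """Parse a day/month/year date (assumed format)."""
    return datetime.strptime(text, "%d/%m/%Y")


# Step 1: keep only the latest row per (ID, TYPE), before any grouping.
latest = {}
for row in rows:
    key = row[:2]
    if key not in latest or parse(row[2]) > parse(latest[key][2]):
        latest[key] = row

# Step 2: group the surviving rows by DECISION and count them.
groups = defaultdict(list)
for row in latest.values():
    groups[row[3]].append(row)

total = sum(len(group) for group in groups.values())
print(dict(groups))
print("Total Count of ID:", total)  # 3 for the sample data
```

Doing the de-duplication before the grouping is what keeps ID 1111 out of the Approved group and out of the total.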


Regex to reduce comma separated category ids to top level id

I am very new to regex, so my first question is: is this possible?
I have products that can be in multiple categories/ subcategories, but for reporting, I just want to attribute them once per top category.
Original data:
1010,1012,1012610,1014243,10147048956,2010,201150205,2011506,2015470
Desired Result:
1010,1012,1014,2010,2011,2015
Details
1010 is unchanged
1012,1012610 reduce to 1 instance of 1012
1014243,10147048956 reduce to 1 instance of 1014
2010 is unchanged
201150205,2011506 reduce to 1 instance of 2011
2015470 is reduced to 2015
My current pattern (?|(10..)|(20..)) works well, with the exception of the following bold sections:
1010,1012,1012610,1014243,10147048956,2010,201150205,2011506,2015470
As for reducing, I am at a loss for where to start.
Thank you in advance for any assistance or direction.
\b(\w{4})
1010,1012,1012610,1014243,10147048956,2010,201150205,2011506,2015470
After applying the regex "\b(\w{4})", collect the matched values in a Set; that makes the elements unique.
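A minimal sketch of that idea in Python, assuming every category id has at least four characters and that the first four always identify the top-level category. It uses `dict.fromkeys` rather than a plain set, which is a swap on my part so that the original order is preserved.

```python
import re

data = "1010,1012,1012610,1014243,10147048956,2010,201150205,2011506,2015470"

# \b only matches at the start of each id (the commas are non-word characters),
# so each id contributes exactly one 4-character prefix.
prefixes = re.findall(r"\b(\w{4})", data)

# Remove duplicates while keeping first-seen order.
unique = list(dict.fromkeys(prefixes))

result = ",".join(unique)
print(result)  # 1010,1012,1014,2010,2011,2015
```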

Repeated events in calendar

First of all, please excuse my perfectible English, I am French.
I'm coming to ask for some advice on a database design problem.
I have to design a calendar with events. Briefly, an event includes a start date/time, an end date/time, and a description.
The problem is that I have to consider repetitions: when creating an event, it is possible to indicate that it starts next week and repeats either until a specific date or indefinitely.
I see two possibilities of design:
1. Create an events table with id, start_datetime, end_datetime and description fields.
When adding a new event, we generate as many rows as there are repeated events.
Advantages: we can do a SELECT * to retrieve all the events, without any particular algorithm. In addition, it is possible to modify the description of each occurrence of an event, since each occurrence is a separate row.
Disadvantage (MAJOR!): if there is no end date, the repetition is infinite, and we cannot store an infinite number of rows...
2. Take inspiration from the method described on this thread, that is to say two tables:
events table
id description
1 Single event on 2018-11-23 08:00-09:30
2 Repeated event :
* every monday from 10:00 to 12:00 from Monday 2018-11-26
* every wednesday from 2018-11-28 from 14:00 to 14:45 until 2019-02-27
event_repetitions table
id event_id start_datetime end_datetime interval end_date
1 1 2018-11-23 08:00:00 2018-11-23 09:30:00 NULL NULL
2 2 2018-11-26 10:00:00 2018-11-26 12:00:00 604800 NULL
3 2 2018-11-28 14:00:00 2018-11-28 14:45:00 604800 2019-02-27
Note: interval is the number of seconds between each occurrence, 604800 = 24 (hours) * 3600 (seconds) * 7 (days).
Advantage: in the case of infinite repetitions (the event with id 2), we have very few rows to write, and performance is better.
Disadvantages: if we want to modify the description of the event (or other possible fields) for one specific occurrence and not the others, we cannot do so without creating a third table, event_descriptions for example:
id event_id user_id datetime description
1 2 1 2018-11-26 10:00:00 Comment from 2018-11-26
2 2 2 2018-12-03 10:00:00 Comment of the second occurrence, i.e. from 2018-12-03
Note: user_id is the logged-in user who wrote the comment.
Another disadvantage is that to get the list of events for a given day, week, or month, the selection query will be more complex and use joins. The event_descriptions table may, when there are hundreds of thousands of events, be very big.
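To make the cost of the second design concrete, here is a rough sketch of how one event_repetitions row could be expanded into the occurrences that start inside a given window. It is written in Python only for illustration (the real application would do this in PHP or SQL). The function name `occurrences` and its parameters are hypothetical, and it assumes that end_date is inclusive and that interval is NULL (None) for single events.

```python
from datetime import date, datetime, timedelta


def occurrences(start, end, interval, end_date, window_start, window_end):
    """Yield (start, end) of each occurrence starting in [window_start, window_end)."""
    duration = end - start
    if interval is None:
        # Single event: it either starts inside the window or it does not.
        if window_start <= start < window_end:
            yield start, end
        return
    step = timedelta(seconds=interval)
    # Jump straight to the first occurrence at or after window_start
    # (ceiling division), so endless repetitions never need a full scan.
    skipped = max(0, -((start - window_start) // step))
    current = start + skipped * step
    while current < window_end:
        if end_date is not None and current.date() > end_date:
            break
        yield current, current + duration
        current += step


# Row 3 of event_repetitions, expanded for December 2018.
december = list(occurrences(
    datetime(2018, 11, 28, 14, 0), datetime(2018, 11, 28, 14, 45),
    604800, date(2019, 2, 27),
    datetime(2018, 12, 1), datetime(2019, 1, 1),
))
for occurrence_start, occurrence_end in december:
    print(occurrence_start, occurrence_end)
```

With this approach nothing is stored per occurrence; only rows in event_descriptions exist for the occurrences that were actually commented.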
My question is: what would you recommend as a more effective alternative? Maybe the second solution is good? What do you think?
In terms of technologies, I intend to use MySQL, the DBMS I know best. Nevertheless, if you think that something like MongoDB is better for very large numbers of rows, do not hesitate to say so.
For information, my application is an API developed with API Platform, so Symfony 4 with Doctrine ORM.
Thank you in advance for your answers.
I am bumping this question in the hope of getting more answers.

Excel IF function and in between values, but only if

I have values for product price, weight and postage service. There are two postage services (express and eco). The postage price depends on the weight, but the service depends on the product price (express for items over £5, eco for items under).
Service: if product price(A2)
<5=eco; >5=express
Service price(C2) by weight(B2):
<=1000gr= £2 eco or £3 express
1001-1250gr= £5 eco or £6 express
1251-5000gr=£9 eco or £11 express
Cells A2 and B2 always contain a value. I need a formula for C2 that displays the service price calculated from the weight: the express price if the item costs over £5, otherwise the eco price.
I have tried:
=IF(AND(OR(B2<=1000),A2<5),2,IF(AND(OR(B2>1000,B2<=1250),A2<5),5,IF(AND(OR(B2>1250,B2<=5000),A2<5),9)))
=IF(AND(OR(B2<=1000),A2<5),2)+IF(AND(OR(B2>=1001,B2<=1250),A2<5),5)+IF(AND(OR(B2>2000),A2<5),9)
Didn't start adding A2>5, because nothing works anyway! Tried many more, but no luck.
Would appreciate any help because stuck and ran out of options :(
Thanks!
There are a couple of ways to accomplish this. The preferred method is to build a small cross-reference table for your surcharges and use the VLOOKUP function to return the values.
However, this question was about hard-coded values in a conditional statement, so I will address that with a LOOKUP function and arrayed constants.
The standard formula in C2 is,
=LOOKUP(B2,{0,1001,1251},{2,5,9})+(A2>5)*LOOKUP(B2,{0,1001,1251},{1,1,2})
Fill down as necessary.
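The same pricing rule can be sketched outside the spreadsheet. The Python below is only an illustration and not part of the workbook. The function name `shipping_price` is made up, and it assumes that a product price above £5 selects express, that weights are in grams, and that weights over 5000 g have no price.

```python
from bisect import bisect_right

WEIGHT_BREAKS = [0, 1001, 1251]  # lower bound of each weight band, in grams
ECO = [2, 5, 9]                  # eco price per band, in pounds
EXPRESS = [3, 6, 11]             # express price per band, in pounds
MAX_WEIGHT = 5000


def shipping_price(price, weight):
    """Return the postage for a product price and weight, or None if too heavy."""
    if weight < 0:
        raise ValueError("weight must not be negative")
    if weight > MAX_WEIGHT:
        return None
    # Index of the last break that is <= weight, like LOOKUP on sorted breaks.
    band = bisect_right(WEIGHT_BREAKS, weight) - 1
    table = EXPRESS if price > 5 else ECO
    return table[band]


print(shipping_price(4, 500))    # eco, lightest band
print(shipping_price(10, 3000))  # express, heaviest band
```

Keeping the break points and prices in small lists mirrors the cross-reference table that VLOOKUP would use.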
In the sample worksheet, custom number formats were used on columns A and B ([Color9]\Exp\r\e\s\s - [$£-809]#,##0.00;;[Color10]\Eco - [$£-809]#,##0.00; and 0\g\r_)). Weights >5000 in column B trigger conditional formatting in column C that displays "too heavy".

Issue with ms access 2000, repeating display of same field in query

I am having an issue with MS Access 2000: when I enter the same value for a field in a query multiple times, the matching record is only displayed once. For example, if I enter the number 8150 multiple times, it is only displayed once.
This image shows the query.
I've already checked everything on ms access 2000 to try to resolve this issue but I've come up with nothing suitable.
I know your data set is simplified, but looking at your data, inputs, etc, it appears your query is pulling from a single table and repeating results -- so there is no join consideration.
I think the issue is your DISTINCTROW in the query, which is removing all duplicate values.
If you remove the "DISTINCTROW," I believe it may give you what you are expecting. In other words, change this:
SELECT DISTINCTROW Ring.[Ring Number], Ring.[Mounting Weight]
FROM Ring
To this:
SELECT Ring.[Ring Number], Ring.[Mounting Weight]
FROM Ring
For what it's worth, there may also be some strategies for simplifying how this query is run in the future (less dependence on dialog box prompts), but I know you probably want to address the issue at hand first, so let me know if this doesn't do it.
-- EDIT --
The removal of distinct still applies, but I suddenly see the problem. The query is depicting the logic as "OR" of multiple values. Therefore, repeating the value does not mean multiple rows, it just means you've repeated a true condition.
For example, if I have:
Fruit Count
------ ------
Apple 1
Pear 1
Kiwi 3
and I say select where Fruit is Apple or Apple or Apple or Apple, the query is still only going to list the first row, once. The WHERE clause only decides whether each row is included; once one "Or" condition is true for a row, repeating it changes nothing.
That does not sound like what you want.
Here's what I think you need to do:
Get rid of the prompts within the query
Load your options into a separate table -- the repetition can occur here
Change your query to perform an inner join on the new table
New table (named "Selection" for the sake of example):
Entry Ring Number Mounting Weight
----- ----------- ----------------
1 8105 you get the idea...
2 8110
3 8110
4 8110
5 8115
6 8130
7 8130
8 8130
9 8130
10 8150
New Query:
select
Ring.[Ring Number], Ring.[Mounting Weight]
from
Ring
Inner join Selection on Ring.[Ring Number] = Selection.[Ring Number]
This has the added advantage of allowing more (or fewer) than 10 records.
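The difference between the two approaches can be checked with a small experiment. This sketch uses Python's built-in sqlite3 module rather than Access, so the SQL dialect differs slightly; the underscore column names and the mounting weights are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Ring (ring_number INTEGER, mounting_weight REAL);
    INSERT INTO Ring VALUES (8105, 1.5), (8110, 2.0), (8150, 3.2);
    CREATE TABLE Selection (entry INTEGER, ring_number INTEGER);
    INSERT INTO Selection VALUES (1, 8105), (2, 8110), (3, 8110), (4, 8150);
""")

# Repeating a value in an OR filter still returns the row only once.
or_rows = conn.execute(
    "SELECT ring_number FROM Ring "
    "WHERE ring_number = 8110 OR ring_number = 8110 OR ring_number = 8110"
).fetchall()

# Joining on the Selection table returns one row per Selection entry.
join_rows = conn.execute(
    "SELECT Ring.ring_number, Ring.mounting_weight "
    "FROM Ring INNER JOIN Selection "
    "ON Ring.ring_number = Selection.ring_number "
    "ORDER BY Selection.entry"
).fetchall()

print(or_rows)    # one row for 8110
print(join_rows)  # 8110 appears twice, once per Selection entry
conn.close()
```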

Using Index and Match functions to return a value from multiple worksheets in a workbook

I have a url report that gets generated on a running weekly basis. Each week the report generates a new worksheet within a workbook that keeps around 6 months worth of data at a time. I want to find and pull the data on a specific url from the worksheets and display them in a new worksheet.
For example data in a worksheet might look like:
Week of Mar 9
URL | Visits | Conversions
mysite.com/apple | 300 | 10
mysite.com/banana | 100 | 20
mysite.com/pear | 600 | 5
And each worksheet in the workbook is a different week, such as Mar 2, Feb 23, etc.
Now, I want every Apple url in one worksheet so that I can compare...Apples to Apples...(pun intended). Since there are hundreds of these I can't afford the time to manually do this for each segment I need, so I tried the following.
=INDEX('312015'!8:999,MATCH("apple",'312015'!8:999,-1))
I'm uncertain of which switch to use for Match, other than 0 is "exact match" from what I read online, so I tried both 1 and -1 to get a not-exact match, though reality is I probably need a partial-match since apple is only part of the url.
Any suggestions on how to get this to work or a better way to do this in excel would be great. Also, I can not manipulate the report output themselves as it comes from a third party vendor and I've already asked them about adjusting this.
I thought about using vlookup as well, but I believe that only returns the first result with that value and not multiple ones.
Assuming URL in ColumnA, Visits in ColumnB, Conversions in ColumnC for your source data, and in another sheet your page (fruit: apple/banana/pear) in A1,Visits in B1, Conversions in C1 and sheet names in A2 downwards, then I suggest in B2:
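=INDEX(INDIRECT("'"&$A2&"'!B:B"),MATCH("*"&$A$1,INDIRECT("'"&$A2&"'!A:A"),0))
in C2:
=INDEX(INDIRECT("'"&$A2&"'!C:C"),MATCH("*"&$A$1,INDIRECT("'"&$A2&"'!A:A"),0))
and the two formulae copied down to suit.
The single quotes around the sheet name are needed for names such as 312015 that start with a digit or contain spaces. Match type 0 asks for an exact match, but the leading * wildcard lets it match any URL that ends with the text in A1.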
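For anyone who wants to check the matching logic outside Excel, here is a rough Python equivalent: for each worksheet, return the Visits and Conversions of the first URL that matches a wildcard pattern. The "Mar 9" rows come from the question; the "Mar 2" rows and the names `workbook`, `first_match` and `summary` are made up for illustration.

```python
from fnmatch import fnmatchcase

# Each worksheet name maps to its (URL, Visits, Conversions) rows.
workbook = {
    "Mar 9": [
        ("mysite.com/apple", 300, 10),
        ("mysite.com/banana", 100, 20),
        ("mysite.com/pear", 600, 5),
    ],
    # Hypothetical second week, only here to show the per-sheet lookup.
    "Mar 2": [
        ("mysite.com/banana", 90, 15),
        ("mysite.com/apple", 250, 8),
    ],
}


def first_match(rows, pattern):
    """Return (visits, conversions) of the first URL matching the wildcard."""
    for url, visits, conversions in rows:
        # Lower-case both sides because Excel's MATCH ignores case.
        if fnmatchcase(url.lower(), pattern.lower()):
            return visits, conversions
    return None


summary = {sheet: first_match(rows, "*apple") for sheet, rows in workbook.items()}
print(summary)
```

Like MATCH, this returns only the first hit per sheet, which is enough as long as each URL appears once per week.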