Inline prompting in a virtual attribute? - universe

I am trying to calculate what the balance of an invoice was at a certain period where
BAL = Invoice.AMT - sum(adj.amts).
The adjustments amounts (adj.amts) are MV and have associated dates.
Therefore, I want to be able to have pass data via prompt for ACCTG.PERIOD.
SUBR('-IFS',SUBR('-LES',ACCTG.PERIOD,REUSE('ADJ.DATES')),'1','')
I also tried
SUBR('-IFS',SUBR('-LES',<<A,ACCTG.PERIOD>>,REUSE('ADJ.DATES')),'1','')
and that did not work.
This cannot be in a paragraph.
This has to be done in an attribute otherwise audit gets all crazy. And I have to use the generic subroutines like IFS and LES.

If I get what you are saying, I think you can use the WHEN statement with SORT to get what you are after provided your dictionary is properly associated or you can figure out how to associate the Multi-values inline.
SORT FILE WHEN ADJ.DATES LT "YOURDATE" AND ADJ.AMTS GT 0 BREAK.ON #ID TOTAL ADJ.AMTS DET.SUP
This example would sort the contents of FILE which assumes that FILE has 2 multivalue fields call ADJ.DATES and ADJ.AMTS and that they are a Date and Decimal type. It then finds all the associations in each record that meet that criteria and totals ADJ.AMTS for each record. The ID.SUP suppress the details.
Good Luck.

Related

AWS Personalize items attributes

I'm trying to implement personalization and having problems with Items schema.
Imagine I'm Amazon, I've products their brands and their categories. In what kind of Items schema should I include this information?
Should I include brand name as string as categorical field? Should I rather include brand ID as string or numeric? or should I include both?
What about categories? I've the same questions.
Metadata Fields Metadata includes string or non-string fields that
aren't required or don't use a reserved keyword. Metadata schemas have
the following restrictions:
Users and Items schemas require at least one metadata field,
Users and Interactions datasets can contain up to five metadata
fields. An Items dataset can contain up to 50 metadata fields.
If you add your own metadata field of type string, it must include the
categorical attribute. Otherwise, Amazon Personalize won't use the
field when training a model.
https://docs.aws.amazon.com/personalize/latest/dg/how-it-works-dataset-schema.html
There are simply 2 ways to include your metadata in Items/Users datasets:
If it can be represented as a number value, then provide the actual value if it makes sense.
If it can be represented as string, then provide the string value and make sure, that categorical is set to true.
But let's take a look into "Why does they need me, to categorize my strings metadata?". The answer is pretty simple.
Let's start with an example.
If you would have Items as Amazon.com products and you would like to provide rates metadata field, then:
You could take all of the rates including the full review text sent by clients and simply put it as metadata field.
You can take just stars rating, calculate the average and put it as metadata field.
Probably the second one is making more sense in general. Having random, long reviews of product as metadata, pretty much changes nothing. Personalize doesn't understands if the review itself is good or bad, or if the author also recommends another product, so pretty much it doesn't really add anything to the recommendations.
However if you simply "cut" your dataset and calculate the average rating, like in the 2. point, then it makes a lot more sense. Maybe some of our customers like crappy products? Maybe they want to buy them, because they are famous YouTubers and they create videos about that? Based on their previous interactions and much more, Personalize will be able to perform just slightly better, because now it knows, that this product has rating of 5/5 or 3/5.
I wanted to show you, that for some cases, providing Items metadata as string makes no sense. That's why your string metadata must be categorical. It means, that it should be finite set of values, so it adds some knowledge for Personalize about given Item and why some of people might want to interact with it.
Going back to your question:
Should I include brand name as string as categorical field? Should I rather include brand ID as string or numeric? or should I include both?
I would simply go with brand ID as string. You could also go with brand name, but probably single brand can be renamed, when it's still the same brand, so picking up the ID would be more constant. Also two different brands could have the same names, because they are present on different markets, so picking up the ID solves that.
The "categorical": true switch in your schema just tells Personalize:
Hey, do you see that string field? It's categorised, finite set of values. If you train a model for me, please include this one during the training, it's important!
And as it's said in documentation, if you will provide string metadata field, which is not marked as categorical, then Personalize will "think" that:
Hmm.. this field is a string, it has pretty random values and it's not marked as categorical. It's probably just a leftover from Items export job. Let's ignore that.

Update multiple field values matching a condition in InfluxDB

In an InfluxDB measurement, how can the field values of points matching a query be updated? Is this still not easily doable as of v1.6?
As the example in that GitHub ticket suggested, what's the cleanest way of achieving something like this?
UPDATE access_log SET username='something' WHERE mac='xxx'
Anything better than driving it all from the client by updating individual points?
Q: How can the field values of points matching a query be updated? Is this still not easily doable as of v1.4?
A: From the best of my knowledge, there isn't an easy way to accomplish update in version 1.4 yet.
Field value of a point can only be updated by overriding. That is, to overwrite its value you'll need to know the details of your points. These details include its timestamp and series information, which is the measurement it reside and its corresponding tags.
Note: This "update" strategy can only be used for changing the field value but not tag value. To update a tag value you'll need to first DELETE the point data first and rewrite the entire point data with the updated tag and value.
Q: Anything better than driving it all from the client by updating individual points?
A: Influxdb supports multi-point write. So if you can build a filter to pre-select a small dataset of points, modify their field values and then override them in bulk.
Update is possible and would take the format:
INSERT measurement,tag_name=tag_value_no_quotes value_key_1=value_value_1,value_key_2=value_value_2 time
for example where I want to update the line with tag my_box at time 1526988768877018669 on the box measurement:
INSERT box,box_name=my_box item_1='apple',item_2='melon' 1526988768877018669

Group by similar words

Is there any way to group a table by a text field, having in count that this text field is not always exactly the same?
Example:
select city_hotel, count(city_hotel)
from hotels, temp_grid
where st_intersects(hotels.geom, temp_grid.geom)
and potential=1
and part=4
group by city_hotel
order by (city_hotel) desc
The output I get is the expected, for example, City name and count:
"Vassiliki ";1
"Vassiliki";1
"Vassilias, Skiathos";1
"Vassilias";5
"Vasilikí";25
"Vasiliki";23
"Vasilias";1
But I'd want to group more this field, and get only one "Vasiliki" (or an array with all, this is not a problem) and a count of all the cells containing something similar between them.
I do not know if could this be possible. Maybe some function to text analysis or something similar?
SELECT COUNT(*), `etc` FROM table GROUP BY textfield LIKE '%sili%';
// The '%' is a SQL wildcard, which matches as many of any character as required.
You could do something like the above, choosing a word for the 'like' that best fits the spellings that your users have used.
Something that can help with that would be to do a
SELECT COUNT(*), textfield FROM table GROUP BY textfield ORDER BY textfield;
And selecting the most 'average' spelling for your words.
Otherwise you're starting to get into a bit of language processing, and for that you will want to write some code outside of SQL.
This would be something like https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
To find word's that are the same within an arbitrary margin of error.
There is a MySQL implementation here that you should be able to transpose as needed
https://stackoverflow.com/a/6392380/1287480
(credit https://stackoverflow.com/a/3515291/1287480)
.
(Personal thoughts on the topic)
You Really Really want to think about limiting the input from users that can give you this issue in the first place. It's far far better to give the users a list of places to select from, than it is to push potentially 'dirty' information into your database. That eventually always winds up with you trying to clean the information at a later time. A problem that has kept many people employed for many years.

Reading Sharepoint List Calculated field from PowerShell

I am trying to pull data from a SharePoint list. The field is a calculated column that takes a yes or no answer and changes the words to archived and non-archived.
I can see the data being formatted correctly in the calculated column in IE but when I try to pull the data it shows up as nothing when I check the variable data.
$site = get-spsite https://extranet./sites/site
$web = get-spweb -Identity https://extranet/sites/site
$list=$web.getlist("https://extranet/sites/site/lists/List");
$View = $list.Views["LISTVIEW"]
$listitems = $list.Getitems($view)
foreach ($listitem in $listitems) {
I have tried this also but get an indexing a null variable error.
$mailboxdb = $listitem.Fields["mailboxdb"] -as [Microsoft.SharePoint.SPFieldCalculated];
$mailboxdb.GetFieldValueAsText($listitem["mailboxdb"]);
I see this also in the $listitems output. ows_MailboxDb='string;#Archived'
But when I check $mailboxdb its empty.
Found this but I don't know what it means by stored results.
In Powershell, although you can reference any field in the list in your script, you can only compare retrieve values from "static" fields - that is, you cannot use calculation fields. PowerShell will not complain - but you will not get results in your script. This is because the .Net library for Sharepoint will not do the field calculation for you - that only happens inside the Sharepoint UI itself.
If you need to have access to a "calculated" field, you actually need to have two fields - the calculated field (usually hidden) and a "stored result" field, which must be updated from the calculated value in the last step of the "Update" workflow. Then you can use the "stored value" field in PowerShell - and also, incidentally, in View calculations in Sharepoint.
You basically have two options here. You can have Powershell do the calculation for you, which is probably the simpler of the two options given the basic nature of your calculation.
The second option, as mentioned at the end of your post is to create a new field which can store the result of the calculation. In your case, you could call it status. Then you would create a workflow that runs whenever a list item is updated or created that stores the results of the calculated field in the results field. This seems redundant to me if you have this field for no other reason than to use the value of the calculated field in a PowerShell script.

Custom Date Aggregate Function

I want to sort my Store models by their opening times. Store models contains is_open function which controls Store's opening time ranges and produces a boolean if it's open or not. The problem is I don't want to sort my queryset manually because of efficiency problem. I thought if I write a custom annotate function then I can filter the query more efficiently.
So I googled and found that I can extend Django's aggregate class. From what I understood, I have to use pre-defined sql functions like MAX, AVG etc. The thing is I want to check that today's date is in a given list of time intervals. So anyone can help me that which sql name should I use ?
Edit
I'd like to put the code here but it's really a spaghetti one. One pages long code only generates time intervals and checks the suitable one.
I want to avoid :
alg= lambda r: (not (s.is_open() and s.reachable))
sorted(stores,key=alg)
and replace with :
Store.objects.annotate(is_open = CheckOpen(datetime.today())).order_by('is_open')
But I'm totally lost at how to write CheckOpen...
have a look at the docs for extra