How do you give negative feedback to Amazon Personalize? i.e., tell Personalize that a user doesn't really like a certain item?

I watched the Amazon Personalize deep dive series on YouTube. At timestamp 8:33 in the video, it is mentioned that 'Personalize does not understand negative feedback' and that any interaction you submit is assumed to be a positive one.
But I think that giving negative feedback could improve the recommendations that we give on the whole. Personalize knowing that a user does not like a given item 'A' would help ensure that it does not recommend items similar to 'A' in the future.
Is there any way we can give negative feedback (e.g., a user doesn't like items x, y, z) to Amazon Personalize?
A possible way to give negative feedback that I thought of:
Let's say users can give ratings out of 5 for movies. Every time a user gives a rating >= 3, we add an additional interaction to the interactions dataset (i.e., we have two interactions saying that the user rated a movie >= 3 in the interactions.csv instead of just one). However, if they give a rating <= 2 (meaning they probably don't like the movie), we keep just the single interaction in the interactions dataset (i.e., we only have one interaction saying that the user rated a movie <= 2 in the interactions.csv file).
Would this in any way help to convey to Personalize that ratings <= 2 are not as important, i.e., that the user did not like those items?

Negative feedback, where the user explicitly indicates that they dislike an item, is currently not supported as training input for Amazon Personalize. Furthermore, there is currently no way to add weight/reward to specific interactions by event type or event value (see this answer for details).
With that said, you can use impressions with your interactions to indicate items that were seen by the user but that they chose not to interact with. Impressions are only supported by the User-Personalization recipe. From the docs:
Unlike other recipes, which solely use positive interactions (clicking, watching, or purchasing), the User-Personalization recipe can also use impressions data. Impressions are lists of items that were visible to a user when they interacted with (clicked, watched, purchased, and so on) a particular item.
Using this information, a solution created with the User-Personalization recipe can calculate the suitability of new items based on how frequently an item has been ignored, and change recommendations accordingly. For more information see Impressions data.
Impressions are not the same as an explicitly negative interaction but they do imply that the impressed items were considered less relevant/important to the user than the item that they chose to interact with.
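For example, if you are importing historical data, explicit impressions can be recorded in the interactions CSV by adding an IMPRESSION column that lists the item IDs shown to the user, separated by '|' (a minimal sketch; the user and item IDs below are made up):

USER_ID,ITEM_ID,TIMESTAMP,IMPRESSION
u1,item_42,1618404000,item_42|item_17|item_99
u1,item_99,1618404600,item_99|item_23|item_42

Here the user was shown three items each time but only interacted with the one in ITEM_ID; the others count as ignored impressions.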
Another approach that can be used to consider negative interactions when making recommendations is to have two event types: one event type for positive intent (e.g., "like", "watch", or "purchase") and one event type for dislike (e.g., "dislike"). Then create a Personalize solution that only trains on the positive event type. Finally, at inference time, use a Personalize filter to exclude items that the user has recently disliked.
EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("dislike")
In this case, Personalize is still training only on positive interactions but with the filter you won't recommend items that the user disliked.
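A minimal boto3 sketch of this setup (the dataset group and campaign ARNs, the filter name, and the user ID are placeholders):

import boto3

personalize = boto3.client("personalize")
personalize_runtime = boto3.client("personalize-runtime")

# Create a filter that excludes items the user has disliked.
# The filter takes a short time to become ACTIVE before it can be used.
create_response = personalize.create_filter(
    name="exclude-disliked",
    datasetGroupArn="arn:aws:personalize:us-east-1:123456789012:dataset-group/movies",
    filterExpression='EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("dislike")',
)

# At inference time, apply the filter so disliked items are never returned.
recommendations = personalize_runtime.get_recommendations(
    campaignArn="arn:aws:personalize:us-east-1:123456789012:campaign/movies",
    userId="user-123",
    filterArn=create_response["filterArn"],
)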

Related

Amazon Personalize different Event Types have different Importance

I'm developing a recommendation engine using Amazon Personalize, and found that in the interactions dataset we can input different EVENT_TYPE and corresponding EVENT_VALUE values.
If I build the model with two event types (like purchase & click), can I make the model training understand that a purchase event is more important (indicates a stronger interaction) than a click event by setting the EVENT_VALUE of purchase to 10 and the EVENT_VALUE of click to 3 in the interactions dataset, and perform the model training that way?
Short answer
No - Personalize doesn't consider EVENT_VALUE when calculating recommendations.
The use of event value
In general, Personalize doesn't include event value during training of the model. It's simply ignored.
However, you can use it to implement your own logic. For example, you can provide an event value threshold during Solution creation.
This threshold is used to decide whether a given interaction should be ignored during Solution training. For example, if the event value is the percentage progress of watching a video, then a threshold of 0.9 ensures that the only interactions included in training are ones where the video was (almost) fully watched.
The use of event type
When creating the Solution you can also specify the event type itself, so the solution will ignore all interactions that don't match that event type. It might be helpful in some cases.
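For illustration, a minimal boto3 sketch that uses both options when creating a Solution (the names and ARNs are placeholders, and the 0.9 threshold assumes the "fully watched" example above):

import boto3

personalize = boto3.client("personalize")

# Train only on "video_view" interactions whose EVENT_VALUE is at least 0.9,
# i.e. views where (almost) the whole video was watched.
response = personalize.create_solution(
    name="fully-watched-solution",
    datasetGroupArn="arn:aws:personalize:us-east-1:123456789012:dataset-group/videos",
    recipeArn="arn:aws:personalize:::recipe/aws-user-personalization",
    eventType="video_view",
    solutionConfig={"eventValueThreshold": "0.9"},
)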
Event type can also be used in the Filters option, which was added a few months ago. It can be helpful for filtering out items that the user has already fully watched or bought, for example:
EXCLUDE itemId WHERE INTERACTIONS.event_type in ("fully_watched")
EXCLUDE itemId WHERE INTERACTIONS.event_type in ("purchased")

How can I use Amazon Personalize to predict user affinity for an item based on taxonomy?

I work at a publishing site. I'm interested in developing a model that can predict a user's affinity for a piece or set of content based on the content they have previously engaged with.
Content is classified via categories and tags. Engagement per item could be binary (clicked on) or a 0-1 float value (normalized length of time engaged).
How should I train a model that will allow me to personalize effectively per user?
I don't need realtime access to recommendations. Ideally I would retrain the model weekly with new clickstream data, and batch download data describing each user's top categories and tags with an affinity score.
Thanks.
Working backwards from your use case, the user-personalization recipe is where you should start. This recipe is designed to recommend items (content in your case) to users based on their previous interactions with items/content.
The primary input into this recipe (and all Personalize recipes, for that matter) is interactions/events. For you this would be the clicks/views of content. If you have historical interactions of these clicks, you can prepare a CSV with this data. The minimum required fields are USER_ID, ITEM_ID, and TIMESTAMP, where each row represents a moment in time when a specific user interacted with an item. You can optionally include an EVENT_TYPE column and an EVENT_VALUE column. The values for EVENT_TYPE depend on your application and event taxonomy. If you're just tracking clicks right now, you can use click or view as the event type and then add support for more event types in the future (e.g., bookmark, favorite, etc.) as needed.
For EVENT_VALUE (type float), you could use your normalized length of time engaged. You can use the EVENT_VALUE to filter which events are included in training by specifying an eventType and eventValueThreshold when creating your solution. For example, if you consider any value equal to or greater than, say, 0.4 to indicate positive interest by a user in a piece of content, you can set an eventValueThreshold of 0.4 and Personalize will only include interactions at or above that value in training. Personalize will also include the event value as a feature in the model, but it won't be used to weight or reward interactions based on this value.
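A hypothetical interactions CSV for this use case might look like the following (the user and item IDs are made up; EVENT_VALUE holds the normalized engagement time):

USER_ID,ITEM_ID,EVENT_TYPE,EVENT_VALUE,TIMESTAMP
user-1,article-101,click,0.82,1617892800
user-1,article-205,click,0.15,1617896400
user-2,article-101,click,0.57,1617900000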
The user-personalization recipe will also consider the items and users datasets, if provided. For your use case, providing an items dataset is where you'd specify the categories and tags for each piece of content (item). You can also include the raw text for each piece of content as a textual field in your items dataset. Personalize will automatically extract features from your textual field to improve the relevance of recommendations.
Once you have your datasets imported into a dataset group, you can create a solution using the user-personalization recipe and then a solution version (which represents the trained model). To get batch recommendations weekly, you would use a batch inference job each week to generate recommendations for each user. The output of the batch inference job can then be processed to determine the category and tag affinities for each user based on the recommended content.
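A minimal boto3 sketch of that weekly batch step (the S3 paths, IAM role, and solution version ARN are placeholders; the input file is JSON Lines with one {"userId": "..."} object per line):

import boto3

personalize = boto3.client("personalize")

response = personalize.create_batch_inference_job(
    jobName="weekly-content-recommendations",
    solutionVersionArn="arn:aws:personalize:us-east-1:123456789012:solution/content-solution/version-1",
    roleArn="arn:aws:iam::123456789012:role/PersonalizeBatchRole",
    jobInput={"s3DataSource": {"path": "s3://my-bucket/batch-input/users.json"}},
    jobOutput={"s3DataDestination": {"path": "s3://my-bucket/batch-output/"}},
    numResults=25,  # recommendations per user
)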

What is EVENT_TYPE and EVENT_VALUE in Amazon Personalize?

I am creating a recommendation engine using Amazon Personalize. I have to send it the following data:
USER_ID,ITEM_ID,EVENT_TYPE,EVENT_VALUE,TIMESTAMP
I don't understand what EVENT_TYPE and EVENT_VALUE is in it.
Short explanation
EVENT_TYPE and EVENT_VALUE are optional; if you are just starting with AWS Personalize, you can skip them for now.
EVENT_TYPE is the type of an Interaction stored in the dataset. An Interaction is an interaction of a User with an Item.
EVENT_VALUE is the value associated with that event.
Maybe an example will make it more understandable:
USER_ID - YouTube user ID
ITEM_ID - YouTube video
EVENT_TYPE - video_score, User liked or disliked the Video
EVENT_VALUE - 1 for like and -1 for dislike
TIMESTAMP - When the User watched the Video
Long explanation
Let's start from the beginning: in AWS Personalize you have 3 different types of datasets:
Users
Items
Interactions
The content of the datasets depends on your use case. For example, if you want to make video recommendations for users of a video sharing platform, then your datasets will probably contain data looking like this:
Users: USER_ID, USER_NAME, USER_LAST_LOGIN [...] etc.
Items: VIDEO_ID, VIDEO_CATEGORY, VIDEO_VIEWS [...] etc.
Interactions: USER_ID,VIDEO_ID,EVENT_TYPE,EVENT_VALUE,TIMESTAMP
But to make it compatible with AWS Personalize, you should rename the properties to match the Personalize requirements:
Users: USER_ID, USER_NAME, USER_LAST_LOGIN [...] etc.
Items: ITEM_ID, ITEM_CATEGORY, ITEM_VIEWS [...] etc.
Interactions: USER_ID,ITEM_ID,EVENT_TYPE,EVENT_VALUE,TIMESTAMP
As you can see, the Interactions dataset has information about:
Who (USER_ID) interacted with..
..what item (ITEM_ID)..
..at which time (TIMESTAMP).
Optionally, you can add more information to the Interactions dataset by providing EVENT_TYPE and EVENT_VALUE. So, for example, it would be like this:
Who (USER_ID) interacted with..
..what item (ITEM_ID)..
..at which time (TIMESTAMP)..
..what type of interaction it was (EVENT_TYPE)..
..what was the value of interaction (EVENT_VALUE).
In a service that serves video content, EVENT_TYPE could be, for example, video_view, and EVENT_VALUE would be a value between 0.0 and 1.0 showing how much of the Video the User watched. For example, 0.5 would be 50% of the Video.
EVENT_TYPE and EVENT_VALUE are optional, so you don't have to provide them; the EVENT_VALUE in particular doesn't affect the quality of recommendations, because it is only used for configuring Personalize (more about that later).
There is also one case to remember: if you provide only EVENT_TYPE or only EVENT_VALUE, AWS Personalize will give you an error, because you need either both of them or neither (which makes sense, since there is no point in storing event data with an unknown value or type).
EVENT_TYPE doesn't have to be only video_view. It can also have different values; for example, if a user likes the video, your application will save the interaction like this:
EVENT_TYPE = 'like'
EVENT_VALUE = 1
For a dislike it could be:
EVENT_TYPE = 'like'
EVENT_VALUE = -1
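For completeness, this is roughly how such an interaction could be sent in real time with the PutEvents API (a sketch; the tracking ID, session ID, user ID, and item ID are placeholders):

import time
import boto3

personalize_events = boto3.client("personalize-events")

personalize_events.put_events(
    trackingId="your-event-tracker-tracking-id",
    userId="user-42",
    sessionId="session-1",
    eventList=[
        {
            "eventType": "like",
            "eventValue": -1,  # -1 for dislike, 1 for like
            "itemId": "video-123",
            "sentAt": int(time.time()),
        }
    ],
)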
The use of event value
In general, Personalize doesn't include event value during training of the model. It's simply ignored.
However, you can use it to implement your own logic. For example, you can provide an event value threshold during Solution creation.
This threshold is used to decide whether a given interaction should be ignored during Solution training. For example, if the event value is the percentage progress of watching a video, then a threshold of 0.9 ensures that the only interactions included in training are ones where the video was (almost) fully watched.
You can also specify the event type itself when creating the Solution, so the solution will ignore all interactions that don't match that event type. It might be helpful in some cases.
Event type can also be used in the Filters option, which was added a few months ago. It can be helpful for filtering out items that the User has already fully watched or bought, for example:
EXCLUDE itemId WHERE INTERACTIONS.event_type in ("fully_watched")
EXCLUDE itemId WHERE INTERACTIONS.event_type in ("purchased")

AWS Data Structure and Stack Suggestion for highly filterable data

Firstly, let me know if I should place this in a different Community. It is programming related but less than I would prefer.
I am creating a mobile app which I intend to base on AWS AppSync unless I can determine it is a poor fit.
I want to store a fairly large set of data, say a half million records.
From these records, I need to be able to grab all entries based on a tag and page them from the larger set.
An example of this data would be:
{
  "name": "Product123",
  "tags": [
    {
      "name": "1880",
      "type": "year",
      "value": 7092
    },
    {
      "name": "f",
      "type": "gender",
      "value": 4120692
    }
  ]
}
Various objects may or may not have a specific tag but may have up to 500 tags or more (the seed of initial data has 130 tags). My filter would ignore them if they did not match but return them if they did.
In reading about Query vs. Scan on DynamoDB, I feel like my current data structure would require mostly scanning and be inefficient. Efficiency is only a real restriction due to cost.
With cost in mind, I will focus on the cost per user to access this data in filtered sets. Say 100,000 users for now each filtering and paging data many times a day.
Your concept of tags doesn't sound too different from the concept of Cognito User Pools' groups with AppSync (docs) - authentication based on groups will only return items allowed for groups that the user making the request is in. Cognito's default group limit is 25 per user pool, so while convenient out of the box, it wouldn't itself help you much. Instead, it's interesting just because it's similar conceptually, and can give you insight by looking at how it works internally.
If you go into the AppSync console and set up a request mapping template for groups auth, you'll see that it uses a scan and the contains operation. Doing something similar would probably be your best bet here, if you really want to use Dynamo. If you find that prohibitively costly, you could use a Lambda data source, which allows you to use any data store, if you have one in mind that's a little more flexible for this type of action.
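For illustration, a scan with a contains filter in boto3 could look like this (a sketch that assumes a table named products and the tags flattened into a list or string set of tag names such as "year:1880"; your actual schema will differ):

import boto3
from boto3.dynamodb.conditions import Attr

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("products")

# Scan the table and keep only items whose tag list contains the requested tag.
# The filter is applied after the read, so you still pay for the full scan.
response = table.scan(
    FilterExpression=Attr("tags").contains("year:1880"),
    Limit=100,  # max items evaluated per page; use LastEvaluatedKey to page
)
items = response["Items"]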

SpreeCommerce: Mark order as pending

SpreeCommerce has an order overview, where it's possible to see all the orders and their states. Each day we open the order overview, find the completed orders, pack them, and ship them to the customer.
However, sometimes we don't have the goods for an order in stock, and want to mark the order as "pending", so we don't open the order each day by mistake.
What's the best way in SpreeCommerce to mark an order as "pending", so we only have to check the pending orders, when we get a new shipment of goods from our supplier?
It would be great, if we could use the state property, because SpreeCommerce allows us to filter orders by their state.
Spree supports inventory tracking as described here:
http://guides.spreecommerce.com/developer/inventory.html
This will allow you to flag a shipment as being backordered if any of its inventory units are backordered:
https://github.com/spree/spree/blob/v2.2.1/core/app/models/spree/shipment.rb#L79-L81
An order is considered backordered if any of its shipments are considered backordered:
https://github.com/spree/spree/blob/master/core/app/models/spree/order.rb#L193-L195
Your best bets for putting an order into a backordered state would be:
Turn on inventory tracking in Spree and keep it up to date through synchronization or manual audits
Extend Spree to override what it means for a shipment to be considered backordered and allow this to be set and unset by administrators as stock levels change
Which solution you should choose depends a great deal upon the specifics of your store and how you manage inventory. The specifics of your implementation could make either solution very easy, or very difficult.