With Facebook deprecating its REST API, what happens to affiliations? [duplicate]

Possible Duplicate:
How to get user’s network information using Facebook Graph API? (PHP)
How can I get which college or high school network a Facebook user belongs to? Previously there was the affiliations element, but it's not there in the new Graph API.

There is no way to get this information using the Graph API. The user object of the graph API contains keys for work and education, but these are distinct from the user's network affiliations, which cannot be accessed via the Graph API.
What you need to do instead is use FQL. You can get the user's network affiliations using the affiliations column of the user table. If that's really what you want to do, then that's all you need to know.
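For example, a minimal FQL sketch for the current user (me() and the user table are standard FQL; your app will still need the appropriate permissions to read the field):
SELECT affiliations FROM user WHERE uid = me()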
However, if (as seems likely) you're not strictly interested in the user's network affiliations, but are instead more broadly interested in their education history, beware! As mentioned, affiliations are distinct from work and education in Facebook's data. There is no guarantee that a user who has entered a particular high school or university into the 'Education' section of their profile has also joined the network for that school or university... but they may have. Worse, if someone has, for example, both added a university to their education section and also joined that university's network, the university may appear under different names in the 'affiliations' column and the 'education' column. This means that getting a complete, duplicate-free list of the schools or universities somebody has attended is non-trivial and will require some clever hacks on your end.
For example, when selecting affiliations and education from the user table, one of my friends shows up like this:
"affiliations": [
{
"nid": 16777585,
"name": "Oxford University",
"type": "college"
},
{
"nid": 33585181,
"name": "Queens College, Taunton",
"type": "high school"
}
],
"education": [
{
"school": {
"id": 110611448960664,
"name": "Queens College, Taunton"
},
"type": "High School"
},
{
"school": {
"id": 16686610106,
"name": "University of Oxford"
},
"type": "College"
}
]
Note that her education section includes 'University of Oxford', but in her affiliations, it's 'Oxford University'. If you want to be able to pull a user's education history cleanly, you'll need to do something clever to identify that these are duplicates on your end.
To get an idea of the mess you face, run this query to see the 'education' and 'affiliations' fields of all your friends.
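A query along these lines should do it (the friend table and me() are standard FQL; this sketch assumes your app has the appropriate permissions to read these fields for friends):
SELECT affiliations, education FROM user WHERE uid IN (SELECT uid2 FROM friend WHERE uid1 = me())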
Good luck!

Related

Read data iteratively from DynamoDB inside a Step Functions workflow

I am new to coding and am learning about AWS services. I am trying to fetch data iteratively from a DynamoDB table in a Step Functions workflow. For example, for a given book ID I want to check the names of the customers that purchased that book. I am primarily trying to get practice with the Map state and to integrate it with DDB.
I created a DDB table with id as the partition key and added cust_name1 and cust_name2 as attributes for multiple customers of a book. Now in my Step Functions workflow, I want to use a Map state to query how many people have that book ID. Is this possible? Or is there a better way to use a Map state for this scenario?
I am able to do this in a Task state, but am trying to figure out how to use a Map state for this scenario.
{
    "Comment": "A description of my state machine",
    "StartAt": "DynamoDB Get Book Purchases",
    "States": {
        "DynamoDB Get Book Purchases": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:getItem",
            "Parameters": {
                "TableName": "get-book-data",
                "Key": {
                    "id": {
                        "S": "121-11-2436"
                    }
                }
            },
            "ResultPath": "$.DynamoDB",
            "End": true
        }
    }
}
Map (in Step Functions) is designed to apply a function to each element of an array or list. Count is an aggregation function, so using it as part of a Map state is not that useful.
For your exercise, you could instead allow users to buy multiple books, where the Map state checks the availability of each book in your inventory, as in the sketch below.
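A minimal sketch of that idea, assuming the workflow input looks like {"books": [{"bookId": "121-11-2436"}, ...]} and reusing the get-book-data table from the question:
{
    "Comment": "Check each book in the order against the inventory table",
    "StartAt": "CheckEachBook",
    "States": {
        "CheckEachBook": {
            "Type": "Map",
            "ItemsPath": "$.books",
            "Iterator": {
                "StartAt": "DynamoDB Get Book",
                "States": {
                    "DynamoDB Get Book": {
                        "Type": "Task",
                        "Resource": "arn:aws:states:::dynamodb:getItem",
                        "Parameters": {
                            "TableName": "get-book-data",
                            "Key": {
                                "id": {
                                    "S.$": "$.bookId"
                                }
                            }
                        },
                        "End": true
                    }
                }
            },
            "End": true
        }
    }
}
Each iteration of the Map receives one element of $.books as its input, which is why "S.$": "$.bookId" resolves against that element.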

DynamoDB one-to-one

Hello stackoverflow community,
This question is about modeling one-to-one relationships with multiple entities involved.
Say we have an application about students. Each Student has:
Profile (name, birth date...)
Grades (math score, geography...)
Address (city, street...).
Requirements:
The Profile, Grades and the Address only belong to one Student each time (i.e. one-to-one).
A Student has to have all Profile, Grades and Address data present (there is no student without grades for example).
Updates can happen to all fields, but the profile data mostly remain untouched.
We access the data based on a Student and not by querying for the address or something else (a query could be "give me the grades of student John", or "give me profile and address of student John", etc).
All fields put together are below the 400kb threshold of DynamoDB.
The question is how would you design it? Put all data as a single row/item or split it to Profile, Grades and Address items?
My solution is to go with keeping all the data in one row, defined by the studentId as the PK, with the rest of the data following in a big set of columns. So one item looks like [studentId, name, birthDate, mathsGrade, geographyGrade, ..., city, street].
I find that this way I can have transactional inserts/updates (with the downside that I always have to work with the full item, of course), and while querying I can ask for just the subset of data needed each time.
On top of the above, this solution fits with two of the most important AWS guidelines about dynamo:
keep everything in a single table and
pre-join data whenever possible.
The reason for my question is that I could only find one topic on Stack Overflow about one-to-one modeling in DynamoDB, and the suggested solution (also heavily up-voted) was in favor of keeping the data in separate tables, something that reminds me of a relational-DB kind of design (see the solution here).
I understand that in that context the author tried to keep a more generic use case and probably support more complex queries, but it feels like the option of putting everything together was fully devalued.
For that reason I'd like to open that discussion here and listen to other opinions.
A Basic Implementation
Considering the data and access patterns you've described, I would set up a single student-data table with a partition key that allows me to query by the student, and a sort key that allows me to narrow down my results even further based on the entity I want to access. One way of doing that would be to use some kind of identifier for a student, say studentID, and then something more generalized for the sort key like entityID, or simply SK.
At the application layer, I would classify each Item under one possible entity (profile, grades, address) and store data relevant to that entity in any number of attributes that I would need on that Item.
An example of how that data might look for a student named john smith:
{ studentId: "john", entityId: "profile", firstName: "john", lastName: "smith" }
{ studentId: "john", entityId: "grades", math2045: 96.52, eng1021:89.93 }
{ studentId: "john", entityId: "address", state: "CA", city: "fresno" }
With this schema, all your access patterns are available:
"give me the math grades of student john"
PartitionKey = "john", SortKey = "grades"
and if you store the address within the student's profile entity, you can accomplish "give me profile and address of student John" in one shot (multiple queries should be avoided when possible)
PartitionKey = "john", SortKey = "profile"
Consider
Keep in mind, you need to take into account how frequently you are reading/writing data when designing your table. This is a very rudimentary design, and may need tweaking to ensure that you're not setting yourself up for major cost or performance issues down the road.
The basic idea that this implementation demonstrates is that denormalizing your data (in this case, across the different entities you've established) can be a very powerful way to leverage DynamoDB's speed, and also leave yourself with plenty of ways to access your data efficiently.
Problems & Limitations
Specific to your application, there is one potential problem that stands out: it seems very feasible that the grades Items will start to balloon to the point where they are impossible to manage and become expensive to read/write/update. As you store more and more students, and each student takes more and more courses, your grades entities will expand with them. Say the average student takes anywhere from 35-40 classes and gets a grade for each of them; you don't want to have to manage 35-40 attributes on an Item if you don't have to. You also may not want to get back every single grade every time you ask for a student's grades. Maybe you start storing more data on each grade entity, like:
{ math1024Grade: 100, math1024Instructor: "Dr. Jane Doe", math1024Credits: 4 }
Now for each class, you're storing at least 2 extra attributes. That Item with 35-40 attributes just jumped up to 105-120 attributes.
On top of the performance and cost issues, your access patterns could start to evolve and become more demanding. You may want only the grades from the student's major, or from a certain type of class like humanities or sciences, which is currently unavailable: you will only ever be able to get every single grade for each student. You can apply a FilterExpression to your request to drop some of the unwanted Items, but you're still paying for all the data you've read.
With the current solution, we are leaving a lot on the table in terms of optimizations in performance, flexibility, maintainability, and cost.
Optimizations
One way to address the lack of flexibility in your queries, and possible bloating of grades entities, is the concept of a composite sort key.
Using a composite sort key can help you break down your entities even further, making them more manageable to update and providing you more flexibility when you're querying. Additionally, you would wind up with much smaller and more manageable items, and although the number of items you store would increase, you'll save on cost and performance. With more optimized queries, you'll get only the data you need back so you're not paying those extra read units for data you're throwing away. The amount of data a single Query request can return is limited as well, so you may cut down on the amount of roundtrips you are making.
That composite sort key could look something like this, for grades:
{ studentId: "john", entityId: "grades#MATH", math2045: 96.52, math3082:91.34 }
{ studentId: "john", entityId: "grades#ENG", eng1021:89.93, eng2203:93.03 }
Now, you get the ability to say "give me all of John's MATH course grades" while still being able to get all the grades (by using the begins_with operation on the sort key when querying).
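Sticking with the hypothetical student-data table from above, "give me all of John's MATH course grades" becomes a begins_with key condition:
aws dynamodb query \
    --table-name student-data \
    --key-condition-expression "studentId = :sid AND begins_with(entityId, :prefix)" \
    --expression-attribute-values '{":sid": {"S": "john"}, ":prefix": {"S": "grades#MATH"}}'
Shortening :prefix to "grades#" returns every grade entity again.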
If you think you'll want to start storing more course information under grades entities, you can suffix your composite sort key with the course name, number, identifier, etc. Now you can get all of a student's grades, all of a student's grades within a subject, and all the data about a student's grade within a subject, like its instructor, credits, year taken, semester, start date, etc.
These optimizations are all possible solutions, but may not fit your application, so again keep that in mind.
Resources
Here are some resources that should help you come up with your own solution, or ways to tweak the ones I've provided above to better suit you.
AWS re:Invent 2019: Data modeling with Amazon DynamoDB (CMY304)
AWS re:Invent 2018: Amazon DynamoDB Deep Dive: Advanced Design Patterns for DynamoDB (DAT401)
Best Practices for Using Sort Keys to Organize Data
NoSQL Design For DynamoDB
And keep this one in mind especially when you are considering cost/performance implications for a high-traffic application:
Best Practices for Designing and Using Partition Keys Effectively

AWS Personalize: how to deal with a huge catalog with not enough interaction data

I'm adding a product recommendation feature with Amazon Personalize to an e-commerce website. We currently have a huge product catalog with millions of items. We want to be able to use Amazon Personalize on our item details page to recommend other relevant items to the current item.
Now as you may be aware, Amazon Personalize relies heavily on user interactions to provide recommendations. However, since we only just started this new line of business, we're not getting enough interaction data. The majority of items in our catalog have no interactions at all. A few items (thousands), though, get interacted with a lot, which then exerts a huge influence on the recommendation results. Hence you will see those few items always get recommended even when they are not relevant to the current item at all, creating very odd recommendations.
I think this is what we usually refer to as a "cold-start" situation, except that the usual cold-start problems are about item "cold-start" or user "cold-start", while the problem I am faced with now is a new-business "cold-start": we don't have the basic amount of interaction data to support a fully personalized recommendation. In the absence of interaction data for each item, we want the Amazon Personalize service to rely on the item metadata to provide the recommendation. Ideally, we want the service to recommend based on item metadata, and once it's getting more interactions, to recommend based on item metadata + interactions.
So far I've done quite a bit of research, only to find one solution: increasing explorationWeight when creating the campaign. As this article indicates, higher values for explorationWeight signify higher exploration; new items with low impressions are more likely to be recommended. But it does NOT seem to do the trick for me. It improves the situation a little, but I still often see odd results being recommended due to a higher interaction rate.
I'm not sure if there are any other solutions out there to remedy my situation. How can I improve the recommendation results when I have a huge catalog with not enough interaction data?
I'd appreciate any advice. Thank you and have a good day!
The SIMS recipe is typically what is used on product detail pages to recommend similar items. However, given that SIMS only considers the user-item interactions dataset and you have very little interaction data, SIMS will not perform well in this case, at least at this time. Once you have accumulated more interaction data, you may want to revisit SIMS for your detail page.
The user-personalization recipe is a better match here since it uses item metadata to recommend cold items that the user may be interested in. You can improve the relevance of recommendations based on item metadata by adding textual data to your items dataset. This is a new Personalize feature (see blog post for details). Just add your product descriptions to your items dataset as a textual field as shown below and create a solution with the user-personalization recipe.
{
    "type": "record",
    "name": "Items",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {
            "name": "ITEM_ID",
            "type": "string"
        },
        {
            "name": "BRAND",
            "type": [
                "null",
                "string"
            ],
            "categorical": true
        },
        {
            "name": "PRICE",
            "type": "float"
        },
        {
            "name": "DESCRIPTION",
            "type": [
                "null",
                "string"
            ],
            "textual": true
        }
    ],
    "version": "1.0"
}
If you're still using this recipe on your product detail page, you can also consider using a filter when calling GetRecommendations to limit recommendations to the current product's category.
INCLUDE ItemID WHERE Items.CATEGORY IN ($CATEGORY)
Where $CATEGORY is the current product's category. This may require some experimentation to see if it fits with your UX and catalog.
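A sketch of the runtime call (the ARNs are placeholders; note that string filter values must themselves be JSON-encoded, hence the inner quotes):
aws personalize-runtime get-recommendations \
    --campaign-arn arn:aws:personalize:us-east-1:123456789012:campaign/detail-page \
    --user-id user-123 \
    --filter-arn arn:aws:personalize:us-east-1:123456789012:filter/same-category \
    --filter-values '{"CATEGORY": "\"electronics\""}'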

AWS ASM describe-secret details and date resolution

I am trying to retrieve the secret details using an AWS CLI command, and I am able to get the details. But I am not able to understand the format in which the dates are being returned.
{
    "RotationRules": {
        "AutomaticallyAfterDays": 90
    },
    "Name": "indextract/uat",
    "VersionIdsToStages": {
        "51b23a11-b871-40ec-a5f0-d4d3c90d781e": [
            "AWSCURRENT"
        ],
        "1f0b581b-4353-43d0-9163-f3a8a622b479": [
            "AWSPREVIOUS"
        ]
    },
    "Tags": [],
    "RotationEnabled": true,
    "LastChangedDate": 1596000798.137,
    "LastRotatedDate": 1595914104.584,
    "KmsKeyId": "XXX",
    "RotationLambdaARN": "XXX",
    "LastAccessedDate": 1595980800.0,
    "ARN": "XXX",
    "Description": "ZZZZ"
}
Can someone please help in interpreting LastRotatedDate? Is there a cast function which I can use directly, or on the field after parsing the JSON?
Maybe a Python or a Unix command?
As a second part of the question, my requirement is to get the new password only if it has changed. One way is to make a first API call to get the LastChangedDate and then make the get-secret-value call if required as per the rotation days.
But this would need 2 API calls; is there a way to do this in a single call? Maybe by passing an argument like a date and getting a response only if LastChangedDate is beyond the passed argument?
I could not find a way in the docs, so thought I would ask for suggestions.
LastChangedDate, LastRotatedDate and LastAccessedDate are all timestamp fields described here.
In other words, they are Unix epoch timestamps (seconds since 1970-01-01 UTC, with a millisecond fraction). Converting your LastRotatedDate of 1595914104.584 gives 2020-07-28 05:28:24 UTC.
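For example, in Python (the Unix equivalent with GNU date would be: date -u -d @1595914104.584):
from datetime import datetime, timezone

# describe-secret returns epoch seconds with a millisecond fraction
last_rotated = 1595914104.584
print(datetime.fromtimestamp(last_rotated, tz=timezone.utc))
# 2020-07-28 05:28:24.584000+00:00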
For the second question, I have not found a direct way to do what you expect, but you can handle it with some logic like this:
Keep track of the timestamp at which you know the key was rotated, maybe using DynamoDB to store it.
Based on that flag, decide in your logic whether you need to read a new password or not.
You can use Lambda and CloudWatch rules to handle the trigger and the small amount of logic for it.
But again, it takes some effort to build that logic as well.
Could I ask whether you want to reduce the API calls because of a performance issue?
Thanks,

Finding the best way to put a tiered document into DynamoDB

I've been working with regular SQL databases and now want to start a new project using AWS services. I want the back-end data storage to be DynamoDB, and what I want to store is a tiered document, like an instruction booklet of all the programming tips I've learned, that can be pulled up via a React frontend.
So the data will be in a format like Python -> Classes -> General -> "Information on Classes Text Wall"
There will be more than one subdirectory at times.
Future plans are to be able to add new subfolders, move data to different folders, add "thumbs up", and eventually have multiple accounts with read access to each other's data.
I know how to do this in a SQL DB, but have never used NoSQL before, and figured this would be a great starting spot.
I am also thinking about how to lay out the partition key, and I doubt this side project will ever grow to more than one cluster, but I know that with NoSQL you have to plan your layout ahead of time.
If NoSQL is just a horrible fit for this style of data, let me know as well. This is mostly for practice and to get experience with AWS systems.
DynamoDB is a key-value database with the option to add secondary indexes. It's good for storing documents that don't require full-scan or aggregation queries. If you design your tiered-document application to show only one document at a time, then DynamoDB would be a good choice. You can put the documents in a structure like this:
DocumentTable:
{
    "title": "Python",
    "parent_document": "root",
    "child_documents": ["Classes", "Built In", ...],
    "content": "text"
}
Where:
parent_document - the "title" of the parent document; "root" for "Python" in your example, "Python" for a document titled "Classes"
content - text or an unstructured document with notes, thumbs up, etc. This assumes you don't plan to execute conditional queries over it; otherwise you need a global secondary index. But as you won't have many documents, a full scan of the table won't take long.
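As a sketch, assuming title is the table's partition key, pulling up a single document is then just a GetItem:
aws dynamodb get-item \
    --table-name DocumentTable \
    --key '{"title": {"S": "Classes"}}'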
You can also have another table containing a table of contents for a user's tiered document, which you can use to navigate over the documents more easily; however, in this case you need to take care of the consistency of this table yourself.
Example:
ContentsTable:
{
    "user": ...,  -- primary key for this table, in case you have many users
    "root": {
        "Python": {
            "Classes": {
                "General": [
                    "Information on Classes Text Wall"
                ]
            }
        }
    }
}
Where Python, Classes, General and Information on Classes Text Wall are keys referring to DocumentTable.title. You can also use something other than titles to keep the keys unique. DynamoDB's maximum item size is 400 KB, so this would be enough for a pretty large table of contents.