DynamoDB single-table structure for a many-to-many relation - amazon-web-services

We have two entities, Categories and Users, in a classic many-to-many relationship:
A user can be tagged with multiple categories
A category can have multiple users
Access patterns:
Get the list of categories
Get the list of users, with the categories each user belongs to
Get a single user, with the categories that user belongs to
Get the list of users in a specific category
I tried to model this using the adjacency list pattern,
but I am unsure how to query for the list of users
together with all the categories each user belongs to.

If you have a PK containing the Category and an SK containing the User to model the users in each category, you can create a Global Secondary Index (GSI) with the PK pointing to the original table's SK (User) and the SK pointing to the original table's PK (Category).
Table
| PK | SK | ...
| C#1 | U#1 | ...
| C#1 | U#2 | ...
| C#2 | U#1 | ...
| C#2 | U#3 | ...
GSI
| Table_SK | Table_PK | ...
| U#1 | C#1 | ...
| U#1 | C#2 | ...
| U#2 | C#1 | ...
| U#3 | C#2 | ...
Now you can query:
All categories including their respective users (scan Table)
All users in a single category (query Table)
All users including their respective categories (scan GSI)
All categories that a single user belongs to (query GSI)
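The table-plus-swapped-index idea above can be tried out without DynamoDB at all. The following is a minimal in-memory sketch, assuming plain Python dicts stand in for the table and the GSI; the helper names (build_index, query) are hypothetical and not part of any AWS SDK.

```python
from collections import defaultdict

# Relationship items from the example table: PK = category, SK = user.
ITEMS = [
    {"PK": "C#1", "SK": "U#1"},
    {"PK": "C#1", "SK": "U#2"},
    {"PK": "C#2", "SK": "U#1"},
    {"PK": "C#2", "SK": "U#3"},
]


def build_index(items, partition_attr, sort_attr):
    """Group items by partition key, ordered by sort key (hypothetical helper)."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_attr]].append(item)
    for partition in index.values():
        partition.sort(key=lambda item: item[sort_attr])
    return dict(index)


def query(index, partition_value, sort_attr):
    """Return the sort-key values stored under one partition key."""
    return [item[sort_attr] for item in index.get(partition_value, [])]


# The base table is keyed by (PK, SK); the GSI swaps the two attributes.
table = build_index(ITEMS, "PK", "SK")
gsi = build_index(ITEMS, "SK", "PK")

users_in_c1 = query(table, "C#1", "SK")                      # query Table
categories_of_u1 = query(gsi, "U#1", "PK")                   # query GSI
all_users = {user: query(gsi, user, "PK") for user in gsi}   # scan GSI
```

The same items serve both directions of the relation; only the choice of partition and sort attribute changes.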
Update: Extended model to include metadata as per comments
Table
| PK  | SK   | CAT | USR | Metadata
---------------------------------------
| C#1 | DATA |     |     | { ...: ... }
|     | U#1  | C#1 | U#1 | { ...: ... } (copied from user record)
|     | U#2  | C#1 | U#2 | { ...: ... } (copied from user record)
---------------------------------------
| C#2 | DATA |     |     | { ...: ... }
|     | U#1  | C#2 | U#1 | { ...: ... } (copied from user record)
|     | U#3  | C#2 | U#3 | { ...: ... } (copied from user record)
---------------------------------------
| U#1 | DATA |     |     | { ...: ... }
---------------------------------------
| U#2 | DATA |     |     | { ...: ... }
---------------------------------------
| U#3 | DATA |     |     | { ...: ... }
---------------------------------------
GSI_Users
| Table_USR | Table_CAT |
-----------------------
| U#1 | C#1 |
| | C#2 |
-----------------------
| U#2 | C#1 |
-----------------------
| U#3 | C#2 |
-----------------------
GSI_Categories
| Table_CAT | Table_USR |
-----------------------
| C#1 | U#1 |
| | U#2 |
-----------------------
| C#2 | U#1 |
| | U#3 |
-----------------------
Queries:
All Categories (including their Users): Scan GSI_Categories
All Users (including their Categories): Scan GSI_Users
Specific Category (including metadata): Query Table by C#x and SK=DATA
Specific Category and its Users: Query GSI_Categories by C#x
Specific User (including metadata): Query Table by U#x and SK=DATA
Specific User and its Categories: Query GSI_Users by U#x
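As a rough illustration of the key-based queries in this list, the sketch below only builds the keyword arguments one might pass to the low-level query call of an AWS SDK client such as boto3; it does not contact DynamoDB. The table name AppTable and the helper names are assumptions; the index and attribute names are taken from the tables above. Attribute names go through ExpressionAttributeNames placeholders so that a clash with a DynamoDB reserved word cannot occur.

```python
TABLE_NAME = "AppTable"  # assumption: the real table name is not given


def key_query(partition_attr, partition_value, index_name=None,
              sort_attr=None, sort_value=None):
    """Build kwargs for a Query on a partition key and an optional sort key."""
    params = {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": partition_attr},
        "ExpressionAttributeValues": {":pk": {"S": partition_value}},
    }
    if sort_attr is not None:
        params["KeyConditionExpression"] += " AND #sk = :sk"
        params["ExpressionAttributeNames"]["#sk"] = sort_attr
        params["ExpressionAttributeValues"][":sk"] = {"S": sort_value}
    if index_name is not None:
        params["IndexName"] = index_name
    return params


def user_metadata_query(user_id):
    # Specific User (including metadata): base table, PK = U#x and SK = DATA
    return key_query("PK", f"U#{user_id}", sort_attr="SK", sort_value="DATA")


def user_categories_query(user_id):
    # Specific User and its Categories: GSI_Users, partition key USR
    return key_query("USR", f"U#{user_id}", index_name="GSI_Users")


def category_users_query(category_id):
    # Specific Category and its Users: GSI_Categories, partition key CAT
    return key_query("CAT", f"C#{category_id}", index_name="GSI_Categories")
```

With boto3 installed and credentials configured, a call could then look like boto3.client("dynamodb").query(**user_categories_query(1)).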

Query array column in BigQuery by condition

I have a table in Bigquery with this format:
+------------+-----------------+------------+------------------+---------------------------------+
| event_date | event_timestamp | event_name | event_params.key | event_params.value.string_value |
+------------+-----------------+------------+------------------+---------------------------------+
| 20201110   | 2929929292      | my_event   | previous_page    | /some-page                      |
|            |                 |            | layer            | /some-page/layer                |
|            |                 |            | session_id       | 99292                           |
|            |                 |            | user._id         | 2929292                         |
+------------+-----------------+------------+------------------+---------------------------------+
| 20201110   | 2882829292      | my_event   | previous_page    | /some-page                      |
|            |                 |            | layer            | /some-page/layer                |
|            |                 |            | session_id       | 29292                           |
|            |                 |            | user_id          | 229292                          |
+------------+-----------------+------------+------------------+---------------------------------+
I want to perform a query to get all rows where event_params.value.string_value contains the regex /layer.
I have tried this:
SELECT
"event_params.value.string_value",
FROM `my_project.my_dataset.my_events_20210110`,
UNNEST(event_params) AS event_param
WHERE event_param.key = 'layer' AND
REGEXP_CONTAINS(event_param.value.string_value, r'/layer')
LIMIT 100
But I'm getting this output:
+---------------------------------+
| event_params.value.string_value |
+---------------------------------+
| event_params.value.string_value |
+---------------------------------+
| event_params.value.string_value |
+---------------------------------+
| event_params.value.string_value |
+---------------------------------+
| event_params.value.string_value |
+---------------------------------+
Any idea what I'm doing wrong?
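You are selecting a string literal (the double-quoted text), so every row returns that literal. You should select a column instead.
The other problem is that you're cross joining the table with its arrays, which produces one output row per array element and effectively bloats the table.
If you want whole rows back, the solution is to use a subquery in the WHERE clause: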
SELECT
* -- Not sure what you actually need from the table ...
FROM `my_project.my_dataset.my_events_20210110`
WHERE
-- COUNT(*)>0 means "if you find more than zero" then return TRUE
(SELECT COUNT(*)>0 FROM UNNEST(event_params) AS event_param
WHERE event_param.key = 'layer' AND
REGEXP_CONTAINS(event_param.value.string_value, r'/layer')
)
LIMIT 100
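If you actually want the values from the array, the quick fix is to remove the quotes and select from the unnested alias event_param (event_params itself is an array, so its fields cannot be accessed directly):
SELECT
event_param.value.string_value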
FROM `my_project.my_dataset.my_events_20210110`,
UNNEST(event_params) AS event_param
WHERE event_param.key = 'layer' AND
REGEXP_CONTAINS(event_param.value.string_value, r'/layer')
LIMIT 100
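To see why the two answer queries return different shapes, here is a small pure-Python analogy (not BigQuery code). Each row carries its event_params as a list; the subquery version keeps or drops whole rows, while the cross-join version emits one result per matching array element. The sample rows are made up to mirror the table in the question.

```python
import re

rows = [
    {"event_name": "my_event", "event_params": [
        {"key": "previous_page", "value": {"string_value": "/some-page"}},
        {"key": "layer", "value": {"string_value": "/some-page/layer"}},
    ]},
    {"event_name": "other_event", "event_params": [
        {"key": "previous_page", "value": {"string_value": "/home"}},
    ]},
]


def matches(param):
    """Same predicate as the WHERE clause: key is 'layer' and value contains '/layer'."""
    value = param["value"]["string_value"]
    return param["key"] == "layer" and re.search(r"/layer", value) is not None


# Subquery in WHERE: one result per original row with at least one match.
whole_rows = [row for row in rows if any(matches(p) for p in row["event_params"])]

# Cross join with UNNEST: one result per (row, array element) pair that matches.
unnested_values = [
    p["value"]["string_value"]
    for row in rows
    for p in row["event_params"]
    if matches(p)
]
```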

Queryset - How to find a word in a foreign key in Django?

How do I find a word in a foreign key?
There are these classes:
class Customer(models.Model):
    customer = models.CharField(max_length=255, unique=True)
    # on_delete is required since Django 2.0; CASCADE was the earlier default
    order = models.ForeignKey('Order', on_delete=models.CASCADE)
class Order(models.Model):
    orderName = models.CharField(max_length=255)
There are these records in the Customer and Order tables:
Order:
+-----+------------------+
| id | orderName |
+-----+------------------+
| 1 | Apple juice |
+-----+------------------+
| 2 | Apple pie |
+-----+------------------+
| 3 | Banana juice |
+-----+------------------+
| 4 | Banana pie |
+-----+------------------+
| 5 | Apple ice cream |
+-----+------------------+
| ... | ... |
+-----+------------------+
Customer:
+-----+----------+-------+
| id | Customer | Order |
+-----+----------+-------+
| 1 | A | 2 |
+-----+----------+-------+
| 2 | B | 3 |
+-----+----------+-------+
| 3 | C | 2 |
+-----+----------+-------+
| 4 | G | 1 |
+-----+----------+-------+
| 5 | H | 1 |
+-----+----------+-------+
| ... | ... | ... |
+-----+----------+-------+
I want to get all the records in the customer table whose orders contain "Apple".
I wrote this code:
all_apple_orders = Customer.objects.filter(order='Apple')
I got an error message:
Field expected a number but got 'Apple'.
How should I change the code?
You can check whether the orderName of the related order contains that word, ignoring case, for example with:
all_apple_orders = Customer.objects.filter(order__orderName__icontains='Apple')
or for a case-sensitive match:
all_apple_orders = Customer.objects.filter(order__orderName__contains='Apple')
or you can make use of a regex with word boundaries (regex syntax depends on the database backend; PostgreSQL, for example, uses \y rather than \b for a word boundary):
all_apple_orders = Customer.objects.filter(order__orderName__iregex=r'\bApple\b')
or if you want to match only orders whose name is exactly 'Apple':
all_apple_orders = Customer.objects.filter(order__orderName='Apple')
or for case-insensitive exact matching:
all_apple_orders = Customer.objects.filter(order__orderName__iexact='Apple')
Span the relationship in your lookup by using __ (double underscore):
all_apple_orders = Customer.objects.filter(order__orderName__icontains='Apple')
Reference: Lookups that span relationships
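For intuition, a lookup that spans a relationship roughly corresponds to a SQL join. The sketch below uses Python's built-in sqlite3 rather than Django, with hypothetical table names orders and customer, and reproduces the sample data from the question; the SQL that Django actually generates will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, orderName TEXT);
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        customer TEXT UNIQUE,
        order_id INTEGER REFERENCES orders(id)
    );
    INSERT INTO orders VALUES
        (1, 'Apple juice'), (2, 'Apple pie'), (3, 'Banana juice'),
        (4, 'Banana pie'), (5, 'Apple ice cream');
    INSERT INTO customer VALUES
        (1, 'A', 2), (2, 'B', 3), (3, 'C', 2), (4, 'G', 1), (5, 'H', 1);
""")

# Rough analogue of filter(order__orderName__icontains='Apple'):
# join each customer to its order, then filter on the order's name.
apple_customers = [
    name
    for (name,) in conn.execute(
        """
        SELECT c.customer
        FROM customer AS c
        JOIN orders AS o ON o.id = c.order_id
        WHERE o.orderName LIKE '%' || ? || '%'
        ORDER BY c.id
        """,
        ("Apple",),
    )
]
```

The filter is applied to a column of the joined table, which is what the double underscore in order__orderName expresses.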

Find a string in the entire table (all fields, all columns, all rows) in Django

I have a model (table) in my Django app with 24 fields (columns), and I want to search for a string in it. I want to see a list that shows which rows have this string in any of their fields.
Please have a look at this example:
+-----+------+------+---------+------------+------------+------------+-----+-------------+
| id | name | year | country | attribute1 | attribute2 | attribute3 | ... | attribute20 |
+-----+------+------+---------+------------+------------+------------+-----+-------------+
| 1 | Tie | 1993 | USA | Bond | Busy | Busy | ... | Free |
+-----+------+------+---------+------------+------------+------------+-----+-------------+
| 2 | Ness | 1980 | Germany | Free | Busy | Both | ... | Busy |
+-----+------+------+---------+------------+------------+------------+-----+-------------+
| 3 | Both | 1992 | Sweden | Free | Free | Free | ... | Busy |
+-----+------+------+---------+------------+------------+------------+-----+-------------+
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
+-----+------+------+---------+------------+------------+------------+-----+-------------+
| 24 | Lex | 2001 | Russia | Busy | Free | Free | ... | Both |
+-----+------+------+---------+------------+------------+------------+-----+-------------+
What I am looking to get (by using filters, etc.) is something like this, when I filter on the word "Both" across the entire table; every row that contains "Both" appears in the result below:
+----+------+------+---------+------------+------------+------------+-----+-------------+
| id | name | year | country | attribute1 | attribute2 | attribute3 | ... | attribute20 |
+----+------+------+---------+------------+------------+------------+-----+-------------+
| 1 | Ness | 1980 | Germany | Free | Busy | Both | ... | Busy |
+----+------+------+---------+------------+------------+------------+-----+-------------+
| 2 | Both | 1992 | Sweden | Free | Free | Free | ... | Busy |
+----+------+------+---------+------------+------------+------------+-----+-------------+
| 3 | Lex | 2001 | Russia | Busy | Free | Free | ... | Both |
+----+------+------+---------+------------+------------+------------+-----+-------------+
You can see that the string "Both" appears in different rows and in different columns: one "Both" is under the column "attribute3", another is under the column "name", and the last is under the column "attribute20".
How do you get this result with a Django queryset?
Thanks
Assuming you have modeled the above table as a Django model named Person:
from django.db.models import Q
query_text = "your search string"
Person.objects.filter(
    Q(name__contains=query_text) |
    Q(year__contains=query_text) |
    Q(attribute1__contains=query_text)
    # ... and so on, one Q(...) per remaining field
)
The above code does a case-sensitive search. If you want a case-insensitive search instead, use name__icontains instead of name__contains (and likewise for the other fields).
As suggested by @rchurch4 in a comment, and based on this SO answer, here is how one could search the entire table with fewer lines of code:
from functools import reduce
from operator import or_
from django.db.models import Q
# Note: get_fields() also returns relations; restrict the list if those should not be searched.
all_fields = Person._meta.get_fields()
search_fields = [i.name for i in all_fields]
q = reduce(or_, [Q(**{'{}__contains'.format(f): query_text}) for f in search_fields], Q())
Person.objects.filter(q)
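The same "OR over every column" idea can be sketched outside Django. This example uses Python's built-in sqlite3 with a hypothetical person table (only a few of the columns from the question) and builds the condition from the column list. It is an analogy for the reduce(or_, ...) approach, not the SQL that Django executes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id INTEGER PRIMARY KEY, name TEXT, year INTEGER,
        country TEXT, attribute1 TEXT, attribute3 TEXT
    );
    INSERT INTO person VALUES
        (1, 'Tie', 1993, 'USA', 'Bond', 'Busy'),
        (2, 'Ness', 1980, 'Germany', 'Free', 'Both'),
        (3, 'Both', 1992, 'Sweden', 'Free', 'Free');
""")


def search_all_columns(connection, table, text):
    """Return ids of rows where any column contains `text`.

    The table name is interpolated into the SQL, so it must come from
    trusted code, never from user input.
    """
    columns = [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]
    # One "column LIKE ?" term per column, joined with OR.
    condition = " OR ".join(f'CAST("{col}" AS TEXT) LIKE ?' for col in columns)
    sql = f"SELECT id FROM {table} WHERE {condition} ORDER BY id"
    pattern = f"%{text}%"
    return [row[0] for row in connection.execute(sql, [pattern] * len(columns))]


matching_ids = search_all_columns(conn, "person", "Both")
```

Note that SQLite's LIKE is case-insensitive for ASCII characters by default, so this sketch behaves more like icontains than contains.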

django Queryset exclude() multiple data

I have a database schema like this:
# periode
+------+--------------+--------------+
| id | from | to |
+------+--------------+--------------+
| 1 | 2018-04-12 | 2018-05-11 |
| 2 | 2018-05-12 | 2018-06-11 |
+------+--------------+--------------+
# foo
+------+---------+
| id | name |
+------+---------+
| 1 | John |
| 2 | Doe |
| 3 | Trodi |
| 4 | son |
| 5 | Alex |
+------+---------+
# bar
+------+---------------+--------------+
| id   | employee_id   | periode_id   |
+------+---------------+--------------+
| 1    | 1             | 1            |
| 2    | 2             | 1            |
| 3    | 1             | 2            |
| 4    | 3             | 1            |
+------+---------------+--------------+
I need to show the employees that are not in salary.
For now I do it like this:
queryset=Bar.objects.all().filter(periode_id=1)
result=Foo.objects.exclude(id=queryset)
but it fails. How do I filter the list of employees that are not in salary?
Here you basically want the Foos for which there is no row in the Bar table with periode_id=1.
We can make this work with:
ex = Bar.objects.all().filter(periode_id=1).values_list('employee_id', flat=True)
result=Foo.objects.exclude(id__in=ex)
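The exclude(id__in=...) pattern corresponds roughly to a NOT IN subquery in SQL. Below is a small sketch with Python's built-in sqlite3 and the sample data from the question; the function name is hypothetical, and the SQL that Django emits may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, employee_id INTEGER, periode_id INTEGER);
    INSERT INTO foo VALUES (1, 'John'), (2, 'Doe'), (3, 'Trodi'), (4, 'son'), (5, 'Alex');
    INSERT INTO bar VALUES (1, 1, 1), (2, 2, 1), (3, 1, 2), (4, 3, 1);
""")


def employees_without_salary(connection, periode_id):
    """Names of foo rows that have no bar row in the given period."""
    sql = """
        SELECT name FROM foo
        WHERE id NOT IN (SELECT employee_id FROM bar WHERE periode_id = ?)
        ORDER BY id
    """
    return [name for (name,) in connection.execute(sql, (periode_id,))]


not_paid_period_1 = employees_without_salary(conn, 1)
not_paid_period_2 = employees_without_salary(conn, 2)
```

The original attempt failed because exclude(id=queryset) compares id with a whole queryset of Bar rows; the fix compares it against the list of employee_id values instead.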

Count same field values in Django queryset

I have a Django model with three fields: product, condition and quantity with data such as:
| Product | Condition | Quantity |
+---------+-----------+----------+
| A | new | 2 |
| A | new | 3 |
| A | new | 4 |
| A | old | 1 |
| A | old | 2 |
| B | new | 2 |
| B | new | 3 |
| B | new | 1 |
| B | old | 4 |
| B | old | 2 |
I'd like to sum the quantities of the entries where product and condition are equal:
| Product | Condition | Quantity |
+---------+-----------+----------+
| A | new | 9 |
| A | old | 3 |
| B | new | 6 |
| B | old | 6 |
This answer helps to count entries with the same field value, but I need to count two fields.
How could I implement this?
from django.db.models import Sum
Model.objects.values('product', 'condition').order_by().annotate(Sum('quantity'))
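The answer is terse, so here is a plain-Python sketch of what the query computes. In the ORM call, values('product', 'condition') sets the grouping, the empty order_by() clears any default ordering that would otherwise end up in the GROUP BY, and annotate(Sum('quantity')) adds the total under the default key quantity__sum. The function name below is hypothetical, and the records are the sample data from the question.

```python
from collections import defaultdict

records = [
    ("A", "new", 2), ("A", "new", 3), ("A", "new", 4),
    ("A", "old", 1), ("A", "old", 2),
    ("B", "new", 2), ("B", "new", 3), ("B", "new", 1),
    ("B", "old", 4), ("B", "old", 2),
]


def sum_by_product_and_condition(rows):
    """Group rows by (product, condition) and sum their quantities."""
    totals = defaultdict(int)
    for product, condition, quantity in rows:
        totals[(product, condition)] += quantity
    # Mirror the dict shape returned by values(...).annotate(Sum('quantity')).
    return [
        {"product": product, "condition": condition, "quantity__sum": total}
        for (product, condition), total in sorted(totals.items())
    ]


summary = sum_by_product_and_condition(records)
```

The sketch sorts its output for readability; the ORM query itself returns the groups in no guaranteed order unless an order_by on the grouped fields is added.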