Retrieve multiple tiers of data structure - regex

Suppose I have a text like this:
In [1]: import re
In [2]: with open('text.md', 'r') as f:
   ...:     cont = f.read()
In [3]: cont
Out[3]: '- ## First steps[¶](https://docs.djangoproject.com/en/2.0/#first-steps)\n\n Are you new to Django or to programming? This is the place to start!\n\n - **From scratch:** [Overview](https://docs.djangoproject.com/en/2.0/intro/overview/) | [Installation](https://docs.djangoproject.com/en/2.0/intro/install/)\n - **Tutorial:** [Part 1: Requests and responses](https://docs.djangoproject.com/en/2.0/intro/tutorial01/) | [Part 2: Models and the admin site](https://docs.djangoproject.com/en/2.0/intro/tutorial02/) | [Part 3: Views and templates](https://docs.djangoproject.com/en/2.0/intro/tutorial03/) | [Part 4: Forms and generic views](https://docs.djangoproject.com/en/2.0/intro/tutorial04/) | [Part 5: Testing](https://docs.djangoproject.com/en/2.0/intro/tutorial05/) | [Part 6: Static files](https://docs.djangoproject.com/en/2.0/intro/tutorial06/) | [Part 7: Customizing the admin site](https://docs.djangoproject.com/en/2.0/intro/tutorial07/)\n - **Advanced Tutorials:** [How to write reusable apps](https://docs.djangoproject.com/en/2.0/intro/reusable-apps/) | [Writing your first patch for Django](https://docs.djangoproject.com/en/2.0/intro/contributing/)\n\n ## The model layer[¶](https://docs.djangoproject.com/en/2.0/#the-model-layer)\n\n Django provides an abstraction layer (the “models”) for structuring and manipulating the data of your Web application. 
Learn more about it below:\n\n - **Models:** [Introduction to models](https://docs.djangoproject.com/en/2.0/topics/db/models/) | [Field types](https://docs.djangoproject.com/en/2.0/ref/models/fields/) | [Indexes](https://docs.djangoproject.com/en/2.0/ref/models/indexes/) | [Meta options](https://docs.djangoproject.com/en/2.0/ref/models/options/) | [Model class](https://docs.djangoproject.com/en/2.0/ref/models/class/)\n - **QuerySets:** [Making queries](https://docs.djangoproject.com/en/2.0/topics/db/queries/) | [QuerySet method reference](https://docs.djangoproject.com/en/2.0/ref/models/querysets/) | [Lookup expressions](https://docs.djangoproject.com/en/2.0/ref/models/lookups/)\n - **Model instances:** [Instance methods](https://docs.djangoproject.com/en/2.0/ref/models/instances/) | [Accessing related objects](https://docs.djangoproject.com/en/2.0/ref/models/relations/)\n - **Migrations:** [Introduction to Migrations](https://docs.djangoproject.com/en/2.0/topics/migrations/) | [Operations reference](https://docs.djangoproject.com/en/2.0/ref/migration-operations/) | [SchemaEditor](https://docs.djangoproject.com/en/2.0/ref/schema-editor/) | [Writing migrations](https://docs.djangoproject.com/en/2.0/howto/writing-migrations/)\n - **Advanced:** [Managers](https://docs.djangoproject.com/en/2.0/topics/db/managers/) | [Raw SQL](https://docs.djangoproject.com/en/2.0/topics/db/sql/) | [Transactions](https://docs.djangoproject.com/en/2.0/topics/db/transactions/) | [Aggregation](https://docs.djangoproject.com/en/2.0/topics/db/aggregation/) | [Search](https://docs.djangoproject.com/en/2.0/topics/db/search/) | [Custom fields](https://docs.djangoproject.com/en/2.0/howto/custom-model-fields/) | [Multiple databases](https://docs.djangoproject.com/en/2.0/topics/db/multi-db/) | [Custom lookups](https://docs.djangoproject.com/en/2.0/howto/custom-lookups/) |[Query Expressions](https://docs.djangoproject.com/en/2.0/ref/models/expressions/) | [Conditional 
Expressions](https://docs.djangoproject.com/en/2.0/ref/models/conditional-expressions/) | [Database Functions](https://docs.djangoproject.com/en/2.0/ref/models/database-functions/)\n - **Other:** [Supported databases](https://docs.djangoproject.com/en/2.0/ref/databases/) | [Legacy databases](https://docs.djangoproject.com/en/2.0/howto/legacy-databases/) | [Providing initial data](https://docs.djangoproject.com/en/2.0/howto/initial-data/) | [Optimize database access](https://docs.djangoproject.com/en/2.0/topics/db/optimization/) | [PostgreSQL specific features](https://docs.djangoproject.com/en/2.0/ref/contrib/postgres/)'
Its chapters are retrieved with:
In [9]: chapters = re.findall(r'## (.+)\[', cont)
In [10]: chapters
Out[10]: ['First steps', 'The model layer']
Its sections are obtained with:
In [21]: sections = re.findall(r'- \*\*(.+)\*\*',cont)
In [23]: sections
Out[23]:
['From scratch:',
'Tutorial:',
'Advanced Tutorials:',
'Models:',
'QuerySets:',
'Model instances:',
'Migrations:',
'Advanced:',
'Other:']
I'd like to output a data structure like:
['First steps',['From scratch:',
'Tutorial:',
'Advanced Tutorials:'],
'The model layer',['Models:',
'QuerySets:',
'Model instances:',
'Migrations:',
'Advanced:',
'Other:']]
How can I accomplish this?

Find both chapters and sections simultaneously:
>>> results = re.findall(r'## (.+)\[|- \*\*(.+)\*\*', cont)
Each result is a (chapter, section) tuple with exactly one non-empty element, so you can put them in your desired structure:
>>> structure = []
>>> for c, s in results:
...     if c:
...         structure.extend([c, []])
...     elif s:
...         structure[-1].append(s)
This results in:
>>> structure
['First steps', ['From scratch:', 'Tutorial:', 'Advanced Tutorials:'], 'The model layer', ['Models:', 'QuerySets:', 'Model instances:', 'Migrations:', 'Advanced:', 'Other:']]
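For reference, here is a minimal, self-contained sketch of the same approach wrapped in a function. The function name `group_sections` and the shortened sample text (with example.com URLs) are assumptions made for illustration; they are not part of the original file.

```python
import re

def group_sections(text):
    """Return [chapter, [sections...], chapter, [sections...], ...]."""
    structure = []
    # Each match is a (chapter, section) tuple; only one element is non-empty.
    for chapter, section in re.findall(r'## (.+)\[|- \*\*(.+)\*\*', text):
        if chapter:
            # Start a new chapter with an empty list of sections.
            structure.extend([chapter, []])
        elif section and structure:
            # Attach the section to the most recent chapter.
            structure[-1].append(section)
    return structure

# Shortened, hypothetical sample in the same layout as the text above.
sample = (
    "- ## First steps[¶](https://example.com/#first-steps)\n\n"
    " - **From scratch:** [Overview](https://example.com/overview/)\n"
    " - **Tutorial:** [Part 1](https://example.com/tutorial01/)\n\n"
    " ## The model layer[¶](https://example.com/#the-model-layer)\n\n"
    " - **Models:** [Introduction](https://example.com/models/)\n"
)

print(group_sections(sample))
# ['First steps', ['From scratch:', 'Tutorial:'], 'The model layer', ['Models:']]
```

The `and structure` guard only matters if a section appears before any chapter; in that case the section is skipped instead of raising an IndexError.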

Related

Extract a string from a another column using regexp_extract

I want to get data of s[0] from "column1":
sada/object=fan/sn=dadfs/s[0]=gsf,sdfs,sfdgs,/s[1]=dfsd,sdg,hte,/redirect=sdgfd/
Output should be values of s[0]
gsf,sdfs,sfdgs
I tried escaping the brackets with backslashes, but it's not working:
REGEXP_EXTRACT(column1, 's\\[0\\] = ([^&]+)')
This is in PySpark.
Input:
from pyspark.sql import functions as F
# Spark dataframe:
df = spark.createDataFrame([("sada/object=fan/sn=dadfs/s[0]=gsf,sdfs,sfdgs,/s[1]=dfsd,sdg,hte,/redirect=sdgfd/",)], ["column1"])
# SQL table:
df.createOrReplaceTempView("df")
PySpark:
df.select(F.regexp_extract('column1', r's\[0\]=(.*?),/', 1).alias('match')).show()
# +--------------+
# | match|
# +--------------+
# |gsf,sdfs,sfdgs|
# +--------------+
SQL:
spark.sql("select regexp_extract(column1, r's\\[0\\]=(.*?),/', 1) as match from df").show()
# +--------------+
# | match|
# +--------------+
# |gsf,sdfs,sfdgs|
# +--------------+
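The pattern itself can be checked without a Spark session. The sketch below applies the same regular expression with Python's standard `re` module; this is only an approximation for testing the pattern, since Spark uses Java regular expressions, although this particular pattern relies on syntax common to both. The helper name `extract_s0` is an assumption made for illustration.

```python
import re

def extract_s0(value):
    """Return the text between 's[0]=' and the first ',/' (or '' if absent)."""
    # The brackets are escaped so they match literally; (.*?) is lazy,
    # so it stops at the first ',/' rather than the last one.
    match = re.search(r's\[0\]=(.*?),/', value)
    return match.group(1) if match else ''

column1 = "sada/object=fan/sn=dadfs/s[0]=gsf,sdfs,sfdgs,/s[1]=dfsd,sdg,hte,/redirect=sdgfd/"
print(extract_s0(column1))
# gsf,sdfs,sfdgs
```

Returning an empty string when nothing matches mirrors what `regexp_extract` does in Spark.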

Optimize code of a function for a search filter in django with variable numbers of keywords - too much code, i'm a beginner

Hello great community,
I'm learning Django/Python development and training myself by developing a web app for asset inventory.
I've made a search filter that returns, for example, the assets belonging to a specific user, department, brand, model, or category (computers, desks, etc.). Many of the fields are foreign keys to other tables; the main table is "Cespiti", which means "Asset" in Italian.
After a lot of work, I now have multiple-keyword search working: for example, somebody types a department and a category in the search box and gets the matching results (all desks in a specific department, or all computers of a specific model in a specific department).
I implemented it as a chain of "if" checks: the query is split into single words, the words are counted, and each word filters the results of the previous ones in sequence.
But I'm not satisfied with my code. I think it's too hardcoded: instead of writing an IF branch for each number of keywords (from 1 to 3), I'd like to write something that doesn't depend on the number of keywords.
Here's the code of the view. I hope someone can point me in the right direction.
def SearchResults(request):
    query = request.GET.get('q')
    chiave = query.split()
    lunghezza = int((len(chiave)))
    if lunghezza == 1:
        object_list = Cespiti.objects.filter(
            Q(proprietario__cognome__icontains=chiave[0]) |
            Q(proprietario__nome__icontains=chiave[0]) |
            Q(categoria__nome__icontains=chiave[0]) |
            Q(marca__nome__icontains=chiave[0]) |
            Q(modello__nome__icontains=chiave[0]) |
            Q(reparto__nome__icontains=chiave[0]) |
            Q(matricola__icontains=chiave[0])
        ).distinct()
    elif lunghezza == 2:
        object_list = Cespiti.objects.filter(
            Q(proprietario__cognome__icontains=chiave[0]) |
            Q(proprietario__nome__icontains=chiave[0]) |
            Q(categoria__nome__icontains=chiave[0]) |
            Q(marca__nome__icontains=chiave[0]) |
            Q(modello__nome__icontains=chiave[0]) |
            Q(reparto__nome__icontains=chiave[0]) |
            Q(matricola__icontains=chiave[0])
        ).filter(
            Q(proprietario__cognome__icontains=chiave[1]) |
            Q(proprietario__nome__icontains=chiave[1]) |
            Q(categoria__nome__icontains=chiave[1]) |
            Q(marca__nome__icontains=chiave[1]) |
            Q(modello__nome__icontains=chiave[1]) |
            Q(reparto__nome__icontains=chiave[1]) |
            Q(matricola__icontains=chiave[1])
        ).distinct()
    elif lunghezza == 3:
        object_list = Cespiti.objects.filter(
            Q(proprietario__cognome__icontains=chiave[0]) |
            Q(proprietario__nome__icontains=chiave[0]) |
            Q(categoria__nome__icontains=chiave[0]) |
            Q(marca__nome__icontains=chiave[0]) |
            Q(modello__nome__icontains=chiave[0]) |
            Q(reparto__nome__icontains=chiave[0]) |
            Q(matricola__icontains=chiave[0])
        ).filter(
            Q(proprietario__cognome__icontains=chiave[1]) |
            Q(proprietario__nome__icontains=chiave[1]) |
            Q(categoria__nome__icontains=chiave[1]) |
            Q(marca__nome__icontains=chiave[1]) |
            Q(modello__nome__icontains=chiave[1]) |
            Q(reparto__nome__icontains=chiave[1]) |
            Q(matricola__icontains=chiave[1])
        ).filter(
            Q(proprietario__cognome__icontains=chiave[2]) |
            Q(proprietario__nome__icontains=chiave[2]) |
            Q(categoria__nome__icontains=chiave[2]) |
            Q(marca__nome__icontains=chiave[2]) |
            Q(modello__nome__icontains=chiave[2]) |
            Q(reparto__nome__icontains=chiave[2]) |
            Q(matricola__icontains=chiave[2])
        ).distinct()
    context = {
        'object_list': object_list, 'query': query,
    }
    return render(request, 'search_results.html', context=context)
One way you could do it would be to separate the step of building the Q objects from the view method. That way it could be performed in a loop:
from django.db.models import Q
from django.shortcuts import render
def generate_search_query_params(word):
    # OR together every field that a single keyword may match
    return (
        Q(proprietario__cognome__icontains=word) |
        Q(proprietario__nome__icontains=word) |
        Q(categoria__nome__icontains=word) |
        Q(marca__nome__icontains=word) |
        Q(modello__nome__icontains=word) |
        Q(reparto__nome__icontains=word) |
        Q(matricola__icontains=word)
    )
def SearchResults(request):
    # Default to an empty string so .split() works when 'q' is missing
    query = request.GET.get('q', '')
    queryset = Cespiti.objects.all()
    # Each keyword narrows the previous results, so keywords are ANDed
    for word in query.split():
        queryset = queryset.filter(
            generate_search_query_params(word)
        )
    object_list = queryset.distinct()
    context = {
        'object_list': object_list, 'query': query,
    }
    return render(request, 'search_results.html', context=context)
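The loop above requires every keyword to match at least one field. To make that logic easy to try without a database, here is a rough sketch of the same idea on plain dictionaries instead of a Django queryset. The field names, the sample assets, and the helper names `matches` and `search` are assumptions made for illustration.

```python
# Hypothetical field names standing in for the model's searchable fields.
FIELDS = ("owner", "category", "department")

def matches(asset, word):
    """True if the keyword occurs in any field (case-insensitive, like icontains)."""
    word = word.lower()
    return any(word in str(asset[field]).lower() for field in FIELDS)

def search(assets, query):
    """Keep the assets for which every keyword matches at least one field."""
    results = assets
    # Each keyword filters the results of the previous keywords.
    for word in query.split():
        results = [asset for asset in results if matches(asset, word)]
    return results

assets = [
    {"owner": "Rossi", "category": "desk", "department": "sales"},
    {"owner": "Bianchi", "category": "computer", "department": "sales"},
    {"owner": "Rossi", "category": "computer", "department": "it"},
]

print(search(assets, "sales computer"))
# [{'owner': 'Bianchi', 'category': 'computer', 'department': 'sales'}]
```

An empty query returns every asset here, which corresponds to the view returning the unfiltered queryset when no keywords are given.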
Thanks, I really appreciate your suggestion.
I managed to put the filtering in a loop, so now the number of keywords is unlimited and nothing is hardcoded (thanks a lot). I've thought about Abdul's idea and Damon's solution. I wanted to avoid the initial ".objects.all()", so I arranged it this way: the first "level" is fixed, which avoids the .all(), and all further levels of filtering are looped. What do you think?
def SearchResults(request):
    query = request.GET.get('q')
    chiave = query.split()
    lunghezza = int((len(chiave)))
    object_list = Cespiti.objects.filter(
        Q(proprietario__cognome__icontains=chiave[0]) |
        Q(proprietario__nome__icontains=chiave[0]) |
        Q(categoria__nome__icontains=chiave[0]) |
        Q(marca__nome__icontains=chiave[0]) |
        Q(modello__nome__icontains=chiave[0]) |
        Q(reparto__nome__icontains=chiave[0]) |
        Q(matricola__icontains=chiave[0])
    )
    for I in range(1, lunghezza):
        print(I)
        object_list = object_list.filter(
            Q(proprietario__cognome__icontains=chiave[I]) |
            Q(proprietario__nome__icontains=chiave[I]) |
            Q(categoria__nome__icontains=chiave[I]) |
            Q(marca__nome__icontains=chiave[I]) |
            Q(modello__nome__icontains=chiave[I]) |
            Q(reparto__nome__icontains=chiave[I]) |
            Q(matricola__icontains=chiave[I])
        )
    context = {
        'object_list': object_list, 'query': query,
    }
    return render(request, 'search_results.html', context=context)

Django can't find table, that exists in postgreSQL database

I'm new to database programming, so apologies if I'm asking something simple.
I recently added a few tables to my DB using Django models and migrations, and now I'm using a Python script to fetch the data and print it.
Now to my error:
DB is connected successfully
Failed to execute database program
relation "cgi_limit" does not exist
LINE 1: SELECT * FROM CGI_limit
^
connection of DB had close successfully
I have double-checked the naming. I tried other tables such as auth_user, and the script was able to print their contents. I also checked that the table exists in my DB, as shown below:
Farm=# SELECT * FROM pg_tables;
 schemaname |         tablename          | tableowner | tablespace | hasindexes | hasrules | hastriggers | rowsecurity
------------+----------------------------+------------+------------+------------+----------+-------------+-------------
 public     | django_session             | FAT        |            | t          | f        | f           | f
 public     | auth_permission            | FAT        |            | t          | f        | t           | f
 public     | auth_user_user_permissions | FAT        |            | t          | f        | t           | f
 public     | auth_user                  | FAT        |            | t          | f        | t           | f
 public     | django_admin_log           | FAT        |            | t          | f        | t           | f
 public     | CGI_ambient                | FAT        |            | t          | f        | f           | f
 public     | CGI_tank_system            | FAT        |            | t          | f        | f           | f
 public     | CGI_limit                  | FAT        |            | t          | f        | f           | f
My Python code that reads the DB:
# import libraries
import psycopg2 as pg2
from datetime import timedelta, datetime, date
############################################
# Function codes
def getDbConnection():
    # Get database connection
    try:
        connection = pg2.connect(user='FAT',
                                 password='*******',
                                 host='',
                                 port='5432',
                                 database='Farm')
        print("DB is connected successfully")
        return connection
    except (Exception, pg2.DatabaseError) as error:
        print("Failed to connect to database")
def closeDbConnection(connection):
    # Close database connection
    try:
        connection.close()
        print("connection of DB had close successfully")
    except (Exception, pg2.DatabaseError) as error:
        print("Failed to close database connection")
def DisplayDBdata():
    try:
        connection = getDbConnection()
        cursor = connection.cursor()
        query = 'SELECT * FROM "CGI_limit"'
        cursor.execute(query,)
        records = cursor.fetchall()
        for row in records:
            print("date: = ", row[1])
    except (Exception, pg2.DatabaseError) as error:
        print("Failed to execute database program")
        print(error)
    finally:
        closeDbConnection(connection)
#############################################################
# code to be executed
#DeleteDBdata()
DisplayDBdata()  # for testing only
# end of code that executes
I'm stumped about what to do. I did some Google searching, and the results only mention naming.
I'd appreciate it if you could help me.
PostgreSQL folds unquoted identifiers to lowercase, so an unquoted CGI_limit is looked up as cgi_limit, which does not exist. You will need to put the table name in double quotes to make it work. I would recommend sticking with lowercase names.
query = 'SELECT * FROM "CGI_limit"'
Documentation link
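To illustrate why the unquoted name fails, here is a toy sketch that mimics the folding rule in plain Python. It is a simplification for illustration only, not PostgreSQL's actual parser; the function name `resolve_identifier` and the small set of table names are assumptions.

```python
def resolve_identifier(identifier):
    """Mimic (in simplified form) how PostgreSQL resolves a simple identifier."""
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        # Quoted: case is preserved; a doubled quote stands for one quote.
        return identifier[1:-1].replace('""', '"')
    # Unquoted: folded to lowercase before the lookup.
    return identifier.lower()

# Table names as they are stored in the catalog (subset of the listing above).
existing_tables = {"auth_user", "CGI_limit"}

for name in ('auth_user', 'CGI_limit', '"CGI_limit"'):
    resolved = resolve_identifier(name)
    status = "found" if resolved in existing_tables else "does not exist"
    print(f"{name} -> {resolved} ({status})")
# auth_user -> auth_user (found)
# CGI_limit -> cgi_limit (does not exist)
# "CGI_limit" -> CGI_limit (found)
```

This also shows why auth_user worked without quotes: its stored name is already lowercase.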

How to make multi filter on django orm?

I have a table with fields:
No. | name | utc_date | utc_time |
------------------------------------
1 | John | 181014 | 140104.12 |
2 | Mark | 181014 | 152312.01 |
3 | Kim | 181015 | 092345.23 |
4 | Jane | 181015 | 234543.32 |
How can I create a Django ORM query like this?
(utc_date >= 181014, utc_time >=150000.00) AND (utc_date <= 181015, utc_time <= 150000.00 )
I tried the following, but it doesn't work:
MyTable.objects.filter(utc_date__gte=181014,
utc_date__lte=181015,
utc_time__gte=150000.00,
utc_time__lte=150000.00)
For multiple filters, use Q objects.
In your case it should look like
from django.db.models import Q
MyTable.objects.filter(Q(utc_date__gte=181014) &
Q(utc_date__lte=181015) &
Q(utc_time__gte=150000.00) &
Q(utc_time__lte=150000.00))
I guess you have split a datetime into a date field and a time field named utc_date and utc_time, and you want to filter the data by a datetime range. Do it like this:
# first get data in date range:
data = MyTable.objects.filter(utc_date__gte=181014, utc_date__lte=181015)
# second exclude other
data1 = data.exclude(utc_date=181014, utc_time__lt=150000.00)
data2 = data1.exclude(utc_date=181015, utc_time__gt=150000.00)
data2 is what you need.
The main idea: to get the data between 2018-10-14 15:00 and 2018-10-15 15:00, first get all data between 2018-10-14 00:00 and 2018-10-15 24:00, then exclude the data between 2018-10-14 00:00 and 2018-10-14 15:00 and between 2018-10-15 15:00 and 2018-10-15 24:00.
Your solution doesn't work because adding multiple criteria inside a .filter() just ANDs them all together.
I think the simplest and most readable way to solve this is to look at each date individually. We'll use Q objects to enable logical ORs:
from django.db.models import Q
MyTable.objects.filter(Q(utc_date=181014, utc_time__gte=150000.00) |
Q(utc_date=181015, utc_time__lte=150000.00))
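The difference between ANDing all four criteria and ORing one condition per date can be checked on the sample rows with plain Python. This is only a sketch of the logic, not ORM code; the variable names are assumptions, and utc_time is written as a plain float (092345.23 becomes 92345.23).

```python
# Sample rows from the question: (name, utc_date, utc_time)
rows = [
    ("John", 181014, 140104.12),
    ("Mark", 181014, 152312.01),
    ("Kim", 181015, 92345.23),
    ("Jane", 181015, 234543.32),
]

# All four criteria ANDed together, as in the single .filter() call:
# utc_time would have to be both >= 150000 and <= 150000 at once.
and_all = [name for name, d, t in rows
           if 181014 <= d <= 181015 and t >= 150000.00 and t <= 150000.00]

# One condition per date, ORed together, as in the Q-object answer.
per_date = [name for name, d, t in rows
            if (d == 181014 and t >= 150000.00)
            or (d == 181015 and t <= 150000.00)]

print(and_all)   # []
print(per_date)  # ['Mark', 'Kim']
```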

Compare fields within relationship on Django ORM

I have two models, Route and Stop.
A route can have several stops; each stop has a name and a number. Within the same route, stop numbers are unique.
The problem:
I need to find the routes that contain two given stops where one stop's number is less than the other's.
Consider the following models:
class Route(models.Model):
    name = models.CharField(max_length=20)
class Stop(models.Model):
    route = models.ForeignKey(Route)
    number = models.PositiveSmallIntegerField()
    location = models.CharField(max_length=45)
And the following data:
Stop table
| id | route_id | number | location |
|----|----------|--------|----------|
| 1 | 1 | 1 | 'A' |
| 2 | 1 | 2 | 'B' |
| 3 | 1 | 3 | 'C' |
| 4 | 2 | 1 | 'C' |
| 5 | 2 | 2 | 'B' |
| 6 | 2 | 3 | 'A' |
For example:
Given two locations 'A' and 'B', find the routes that have both locations and where A.number is less than B.number.
With the previous data, it should match route id 1 and not route id 2.
In raw SQL, this works with a single query:
SELECT
`route`.id
FROM
`route`
LEFT JOIN `stop` stop_from ON stop_from.`route_id` = `route`.`id`
LEFT JOIN `stop` stop_to ON stop_to.`route_id` = `route`.`id`
WHERE
stop_from.`stop_location_id` = 'A'
AND stop_to.`stop_location_id` = 'B'
AND stop_from.stop_number < stop_to.stop_number
Is it possible to do this with a single query in the Django ORM as well?
Generally, ORM frameworks like Django ORM, SQLAlchemy, and even Hibernate are not designed to generate the most efficient query automatically. There is a way to write this query using only model objects; however, since I had a similar issue, I would suggest using a raw query for more complex queries. Here is the link to the Django raw query documentation:
[https://docs.djangoproject.com/en/1.11/topics/db/sql/]
You can write your query in many ways, but something like the following could help.
from django.db import connection
def my_custom_sql(self):
    with connection.cursor() as cursor:
        # Triple quotes let the SQL span several lines;
        # the %s placeholders are filled from the parameter list.
        cursor.execute("""SELECT
                `route`.id
            FROM
                `route`
                LEFT JOIN `stop` stop_from ON stop_from.`route_id` = `route`.`id`
                LEFT JOIN `stop` stop_to ON stop_to.`route_id` = `route`.`id`
            WHERE
                stop_from.`stop_location_id` = %s
                AND stop_to.`stop_location_id` = %s
                AND stop_from.stop_number < stop_to.stop_number""", ['A', 'B'])
        row = cursor.fetchone()
    return row
hope this helps.
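To try the self-join idea without a Django project, here is a rough sketch using Python's built-in sqlite3 module and an in-memory database loaded with the sample data from the question. It is not Django code: the column names follow the models (number, location) rather than the SQL in the question, the route names 'r1' and 'r2' and the function name `routes_in_order` are assumptions, and SQLite uses ? placeholders where the Django cursor uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE route (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stop (
        id INTEGER PRIMARY KEY,
        route_id INTEGER REFERENCES route(id),
        number INTEGER,
        location TEXT
    );
    INSERT INTO route (id, name) VALUES (1, 'r1'), (2, 'r2');
    INSERT INTO stop (id, route_id, number, location) VALUES
        (1, 1, 1, 'A'), (2, 1, 2, 'B'), (3, 1, 3, 'C'),
        (4, 2, 1, 'C'), (5, 2, 2, 'B'), (6, 2, 3, 'A');
""")

def routes_in_order(conn, origin, destination):
    """Return ids of routes where `origin` comes before `destination`."""
    rows = conn.execute(
        """
        SELECT route.id
        FROM route
        JOIN stop AS stop_from ON stop_from.route_id = route.id
        JOIN stop AS stop_to ON stop_to.route_id = route.id
        WHERE stop_from.location = ?
          AND stop_to.location = ?
          AND stop_from.number < stop_to.number
        ORDER BY route.id
        """,
        (origin, destination),
    ).fetchall()
    # Each row is a one-element tuple; unpack it into a flat list of ids.
    return [route_id for (route_id,) in rows]

print(routes_in_order(conn, "A", "B"))  # [1]
```

The sketch uses fetchall() rather than fetchone() because more than one route can satisfy the condition.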