Limiting columns in a Django queryset - django

I have tried several approaches to load only certain columns of a model, but all of them failed.
My code:
I use defer:
test_detail = testimport.objects.defer("so_hd", "noi_dung")
or I use only:
test_detail = testimport.objects.only("so_hd", "noi_dung")
or even filter combined with defer/only:
test_detail = testimport.objects.filter().defer("so_hd", "noi_dung")
When I run print(test_detail.values("ten_kh")), I still get results.
There are some columns I rarely use, so I want to save memory and improve speed by not loading those columns.
Please help.

I think you want to select specific columns of a table.
For this you can use values():
data = Model.objects.values('column_1', 'column_2')

Related

Google Sheets Array formula for counting the number of values in each column

I'm trying to create an array formula to auto-populate the total count of values for each column as columns are added.
I've tried doing this using a combination of count and indirect, as well as tried my hand at query, but I can't seem to get it to show unique value counts for each column.
This is my first time attempting to use query, and at first it seemed possible from reading through the documentation on the query language, but I haven't been able to figure it out.
Here's the shared document: https://docs.google.com/spreadsheets/d/15VwsL7uTsORLqBDrnT3VdwAWlXLh-JgoJVbz7wkoMAo/edit?usp=sharing
I know I can do this by writing a custom function in apps script, but I'd like to use the built-in functions if I can for performance reasons (there is going to be a lot of data), and I want quick refresh rates.
Try:
=ARRAYFORMULA(IF(B5:5="",,TRANSPOSE(MMULT(TRANSPOSE(N(B6:99<>"")), SIGN(ROW(B6:99))))))
In B3, try:
=ArrayFormula(IF(LEN(B5:5), COUNTIF(IF(B6:21<>"", COLUMN(B6:21)), COLUMN(B6:21)),))

How to write the query for this requirement?

I have several hundred thousand svn commit records in my Django database. Each record stores the related info of one commit (such as BugID, LinesChanged, SubmitWeek, ...).
I want to summarize each field of the records and create a report grouped by the SubmitWeek field, like the following:
Currently I iterate over the records and process the relevant field values myself. Is there a more succinct way to define the query and extract the summary? Many thanks.
Your question is a bit vague.
If you are looking for a way to make your queries more specific, so that Django does more joins and fewer separate queries, have a look at:
values() and values_list() of the QuerySet
If you want to make Django fetch related objects at once and not in separate queries, have a look at:
prefetch_related() and select_related()
If you want to update data more efficiently, have a look at:
F() https://docs.djangoproject.com/en/1.9/ref/models/expressions/#django.db.models.F
Following the manual, I used the statements below and they seem to work well. Thanks anyway, Risadinha :)
from django.db.models import Sum
# Sum the LinesChanged value over all matching records ("my filter" stands for the filter conditions)
SVN_Commit.objects.filter(my filter).aggregate(Sum('LinesChanged'))
# Get the unique SubmitWeek list
SVN_Commit.objects.filter(my filter).values_list('SubmitWeek', flat=True).order_by('SubmitWeek').distinct()

How to subtract 2 columns with dtype = object within data frame to form a new column of the difference pandas

I have a merged data frame (mdf) built from 2 data frames that were retrieved from SQL. I wish to create a new column within mdf that is the difference of 2 existing columns.
I'm not sure what you mean by a "merge data frame," but here's a sketch of what you might be after. Please elaborate on your question a little so it will be more useful to others.
import pandas as pd
df = pd.read_sql('select ....', some_sql_connection)
df['difference'] = df['some column name'] - df['another column name']
Also, referring to the title of your question where you mention dtype=object, data extracted from a SQL database sometimes defaults to the generic object datatype, even if it is actually numeric. (This is not ideal, and better handling of datatypes to and from SQL databases is being actively improved for a future release of pandas.)
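For now, before manipulating your data, you might want to run df.convert_objects(convert_numeric=True) if all of your data is numeric. (That method was later deprecated and then removed in pandas 1.0; pd.to_numeric is the replacement.) See documentation.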
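On current pandas versions, the conversion can be sketched with pd.to_numeric applied per column. The column names qty_ordered and qty_shipped and the sample values below are hypothetical; the frame is built by hand with dtype=object to imitate what a SQL read might return.

```python
import pandas as pd

# Hypothetical stand-in for the merged frame: numeric data stored as objects.
mdf = pd.DataFrame(
    {"qty_ordered": ["10", "7", "3"], "qty_shipped": ["4", "7", "1"]},
    dtype=object,
)

# Convert each object column to a numeric dtype; errors="coerce" turns
# values that cannot be parsed into NaN instead of raising.
for col in ["qty_ordered", "qty_shipped"]:
    mdf[col] = pd.to_numeric(mdf[col], errors="coerce")

# With numeric dtypes in place, the subtraction is element-wise.
mdf["difference"] = mdf["qty_ordered"] - mdf["qty_shipped"]
```

Whether errors="coerce" or the default errors="raise" is more appropriate depends on whether unparseable values should be tolerated or surfaced.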

Custom Date Aggregate Function

I want to sort my Store models by their opening times. The Store model has an is_open function that checks the store's opening time ranges and returns a boolean indicating whether it is open. The problem is that I don't want to sort my queryset manually, for efficiency reasons. I thought that if I wrote a custom annotate function, I could filter the query more efficiently.
So I googled and found that I can extend Django's aggregate class. From what I understood, I have to use pre-defined SQL functions like MAX, AVG etc. The thing is, I want to check whether today's date falls within a given list of time intervals. Can anyone tell me which SQL function name I should use?
Edit
I'd like to put the code here, but it's real spaghetti code. It is a page long and only generates time intervals and checks for the suitable one.
I want to avoid:
alg = lambda s: not (s.is_open() and s.reachable)
sorted(stores, key=alg)
and replace it with:
Store.objects.annotate(is_open=CheckOpen(datetime.today())).order_by('is_open')
But I'm totally lost as to how to write CheckOpen...
Have a look at the docs for extra().

How to limit columns returned by Django query?

That seems simple enough, but all Django queries seem to be 'SELECT *'.
How do I build a query returning only a subset of fields?
In Django 1.1 onwards, you can use defer('col1', 'col2') to exclude columns from the query, or only('col1', 'col2') to only get a specific set of columns. See the documentation.
values() does something slightly different: it only gets the columns you specify, but it returns a list of dictionaries rather than a set of model instances.
Append .values("column1", "column2", ...) to your query.
The accepted answer advises defer and only, which the docs discourage in most cases:
only use defer() when you cannot, at queryset load time, determine if you will need the extra fields or not. If you are frequently loading and using a particular subset of your data, the best choice you can make is to normalize your models and put the non-loaded data into a separate model (and database table). If the columns must stay in the one table for some reason, create a model with Meta.managed = False (see the managed attribute documentation) containing just the fields you normally need to load and use that where you might otherwise call defer(). This makes your code more explicit to the reader, is slightly faster and consumes a little less memory in the Python process.