Let's say we have a database with existing data. The data is updated by a bash script, and there is no corresponding Django model for it. What is the best way to create a Django endpoint that serves this data in response to a GET request?
What I mean is that if there were a model, we could use something like:
class ModelList(generics.ListCreateAPIView):
    # queryset must be a QuerySet; .first() would return a single instance
    queryset = Model.objects.all()
    serializer_class = ModelSerializer
The workaround that I tried was to create an APIView and inside that APIView to do something like this:
class RetrieveData(APIView):
    def get(self, request):
        conn = None
        try:
            conn = psycopg2.connect(host=..., database=..., user=..., password=..., port=...)
            cur = conn.cursor()
            cur.execute(f'Select * from ....')
            fetched_data = cur.fetchone()
            cur.close()
            res_list = [x for x in fetched_data]
            json_res_data = {"id": res_list[0],
                             "date": res_list[1],
                             "data": res_list[2]}
            return Response({"data": json_res_data})
        except Exception as e:
            return Response({"error": 'Error'})
        finally:
            # Always release the connection, even when the query fails
            if conn is not None:
                conn.close()
I do not believe this is a good solution, though, and it is also a bit slow: about 2 seconds per request. Apart from that, if many GET requests are made at the same time, isn't that going to create a problem on the DB instance, e.g. table locks?
So I was wondering what a better (or the best) solution is for this kind of problem.
Appreciate your time!
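One small improvement to the raw-cursor approach, independent of how the connection is obtained: build the response dict from the column names the cursor reports instead of hard-coded positions, and handle the case where no row comes back. The sketch below is only an illustration. It uses the standard-library sqlite3 module as a stand-in for psycopg2 (both follow the Python DB-API, but psycopg2 uses %s placeholders rather than ?), and the readings table and fetch_one_as_dict helper are made-up names.

```python
import sqlite3


def fetch_one_as_dict(conn, sql, params=()):
    """Run a query and return the first row as a dict, or None if there is no row."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        row = cur.fetchone()
        if row is None:
            return None
        # cursor.description holds one entry per column; item 0 is the column name
        columns = [col[0] for col in cur.description]
        return dict(zip(columns, row))
    finally:
        cur.close()


# Hypothetical table, standing in for the one the bash script refreshes
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, date TEXT, data TEXT)")
conn.execute("INSERT INTO readings VALUES (1, '2020-01-01', 'payload')")

result = fetch_one_as_dict(conn, "SELECT id, date, data FROM readings WHERE id = ?", (1,))
missing = fetch_one_as_dict(conn, "SELECT id, date, data FROM readings WHERE id = ?", (2,))
```

With this shape, adding a column to the SELECT does not require touching the code that builds the response.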
Related
I am trying to create a REST API for a database that already exists. The problem is that the data in the database is refreshed by a bash script every hour, so there is no related model for this data. So I am working on a GET endpoint in Django to retrieve the data. Currently I am using an APIView like this:
class RetrieveData(APIView):
    def get(self, request):
        conn = psycopg2.connect(host=..., database=..., user=..., password=..., port=...)
        cur = conn.cursor()
        cur.execute(f'Select * from ....')
        fetched_data = cur.fetchone()
        cur.close()
        res_list = [x for x in fetched_data]
        json_res_data = {"id": res_list[0],
                         "date": res_list[1],
                         "data": res_list[2]}
        conn.close()
        return Response({"data": json_res_data})
The problem is that connecting to the database on every request to retrieve the data and then return the response is quite slow: about 2 seconds per request. I am also worried about how this will behave when many requests are made at the same time.
So my question is whether there are any suggestions or solutions that you would propose.
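Since the slow part described here is opening a new connection for every request, one option is to open it once and reuse it. Below is a minimal sketch of that idea, assuming one connection per worker thread. sqlite3 is again only a stand-in for psycopg2.connect(...), and get_connection is a made-up helper name. In a real Django project, the framework's own django.db.connection (with the CONN_MAX_AGE setting for persistent connections) or psycopg2's pool module would be the more usual ways to get the same effect.

```python
import sqlite3
import threading

# One cached connection per thread, so concurrent requests never share a handle
_local = threading.local()


def get_connection():
    """Return this thread's connection, opening it only on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(":memory:")  # stand-in for psycopg2.connect(...)
        _local.conn = conn
    return conn


first = get_connection()
second = get_connection()  # no new connection is opened here
```

A production version would also need to detect and replace a connection that the server has closed in the meantime, which this sketch does not do.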
I would like to know the number of sql queries which were executed on a psycopg2 connection.
Is there a way to get this number?
I would like to log a warning if an HTTP request produces too many statements.
I am running a django application. If DEBUG is True, then I have connection.queries. But I would like to get this value from a production server
Update
I want numbers (statistics) from the prod environment. This question is not about debugging a particular http request.
Have a look at django-silk. It is a profiling tool that records metrics like response times and the number of queries.
If you want to roll your own solution and you are using Django 2.0 or later, you can create a middleware with a connection wrapper. The documentation even showcases a QueryLogger class:
import time
from django.db import connection

class QueryLogger:
    def __init__(self):
        self.queries = []

    def __call__(self, execute, sql, params, many, context):
        current_query = {'sql': sql, 'params': params, 'many': many}
        start = time.time()
        try:
            result = execute(sql, params, many, context)
        except Exception as e:
            current_query['status'] = 'error'
            current_query['exception'] = e
            raise
        else:
            current_query['status'] = 'ok'
            return result
        finally:
            # Runs for both outcomes, so failed queries are recorded too
            duration = time.time() - start
            current_query['duration'] = duration
            self.queries.append(current_query)

class QueryLoggingMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        ql = QueryLogger()
        with connection.execute_wrapper(ql):
            response = self.get_response(request)
        # do something with ql.queries here
        return response
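If all you need is the count and a warning, rather than the full query log, the wrapper can be much lighter. The following is a sketch only: QueryCounter takes the same five arguments as the QueryLogger wrapper above, while warn_if_too_many, the threshold of 50 and the logger name are made up for illustration. fake_execute stands in for the real execute callable that Django would pass in, so the snippet runs without a database.

```python
import logging

logger = logging.getLogger("query_count")  # hypothetical logger name


class QueryCounter:
    """Counts statements; same call signature as the QueryLogger wrapper."""

    def __init__(self):
        self.count = 0

    def __call__(self, execute, sql, params, many, context):
        self.count += 1
        return execute(sql, params, many, context)


def warn_if_too_many(counter, path, threshold=50):
    """Log a warning and return True when the request exceeded the threshold."""
    if counter.count > threshold:
        logger.warning("%s ran %d SQL statements (threshold %d)",
                       path, counter.count, threshold)
        return True
    return False


def fake_execute(sql, params, many, context):
    # Stand-in for the execute callable supplied by Django
    return "ok"


counter = QueryCounter()
for _ in range(3):
    counter(fake_execute, "SELECT 1", None, False, {})
```

In the middleware, warn_if_too_many(counter, request.path) would go where the "do something with ql.queries here" comment is.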
The number of queries made in production and in development is the same, provided the database and everything else are set up the same way in both environments.
I recommend using Django Debug Toolbar, as mentioned, to see how many queries your views are running and rethink your code based on that. If you want to look into the performance of those queries, I recommend the EXPLAIN command from PostgreSQL.
I usually copy the query and paste it, prefixed with EXPLAIN, into my PostgreSQL database shell. See this: http://recordit.co/rGZ2SAo7PX
New to django. I'm doing my best to implement CRUD using Django, mongodb, and mongoengine. I'm able to query the database and render my page with the correct information from the database. I'm also able to change some document fields using javascript and do an Ajax POST back to the original Django View class with the correct csrf token.
The data payload I'm sending back and forth is a list of each Document Model (VirtualPageModel) serialized to json (each element contains ObjectId string along with the other specific fields from the Model.)
This is where it starts getting murky. In order to update the original document in my View Class post function I do an additional query using the object id and loop through the dictionary items, setting the respective fields each time. I then call save and any new data is pushed to the Mongo collection correctly.
I'm not sure if what I'm doing to update existing documents is correct or in the spirit of django's abstracted database operations. The deeper I get the more I feel like I'm not using some fundamental facility earlier on (provided by either django or mongoengine) and because of this I'm having to make things up further downstream.
The way my code is now I would not be able to create a new document (although that's easy enough to fix). However what I'm really curious about is how I would know when to delete a document which existed in the initial query, but was removed by the user/javascript code? Am I overthinking things and the contents of my POST should contain a list of ObjectIds to delete (sounds like a security risk although this would be an internal tool.)
I was assuming that my View Class might maintain either the original document objects (or simply ObjectIds) it queried and I could do my comparisons off of that set, but I can't seem to get that information to persist (as a class variable in VolumeSplitterView) from its inception to when I receive the POST at the end.
I would appreciate if anyone could take a look at my code. It really seems like the "ease of use" facilities of Django start to break when paired with Mongo and/or a sufficiently complex Model schema which needs to be directly available to javascript as opposed to simple Forms.
I was going to use this dev work to become django battle-hardened in order to tackle a future app which will be much more complicated and important. I can hack on this thing all day and make it functional, but what I'm really interested in is anyone's experience in using Django + MongoDB + MongoEngine to implement CRUD on a Database Schema which is not very Form-centric (think more nested metadata).
Thanks.
model.py: uses mongoengine Field types.
class MongoEncoder(JSONEncoder):
    def default(self, o):
        if isinstance(o, VirtualPageModel):
            data_dict = (o.to_mongo()).to_dict()
            # ObjectId is not JSON serializable, so send it as a string
            if isinstance(data_dict.get('_id'), ObjectId):
                data_dict.update({'_id': str(data_dict.get('_id'))})
            return data_dict
        else:
            return JSONEncoder.default(self, o)

class SubTypeModel(EmbeddedDocument):
    filename = StringField(max_length=200, required=True)
    page_num = IntField(required=True)

class VirtualPageModel(Document):
    volume = StringField(max_length=200, required=True)
    start_physical_page_num = IntField()
    physical_pages = ListField(EmbeddedDocumentField(SubTypeModel),
                               default=list)
    error_msg = ListField(StringField(),
                          default=list)

    def save(self, *args, **kwargs):
        print('In save: {}'.format(kwargs))
        for k, v in kwargs.items():
            if k == 'physical_pages':
                # Rebuild the embedded documents from the plain dicts
                self.physical_pages = []
                for a_page in v:
                    tmp_pp = SubTypeModel()
                    for p_k, p_v in a_page.items():
                        setattr(tmp_pp, p_k, p_v)
                    self.physical_pages.append(tmp_pp)
            else:
                setattr(self, k, v)
        return super(VirtualPageModel, self).save(*args, **kwargs)
views.py: My attempt at a view
class VolumeSplitterView(View):
    #initial = {'key': 'value'}
    template_name = 'click_model/index.html'
    vol = None
    start = 0
    end = 20

    def get(self, request, *args, **kwargs):
        self.vol = self.kwargs.get('vol', None)
        records = self.get_records()
        records = records[self.start:self.end]
        vp_json_list = []
        img_filepaths = []
        for vp in records:
            vp_json = json.dumps(vp, cls=MongoEncoder)
            vp_json_list.append(vp_json)
            for pp in vp.physical_pages:
                filepath = get_file_path(vp, pp.filename)
                img_filepaths.append(filepath)
        data_dict = {
            'img_filepaths': img_filepaths,
            'vp_json_list': vp_json_list
        }
        return render_to_response(self.template_name,
                                  {'data_dict': data_dict},
                                  RequestContext(request))

    def get_records(self):
        return VirtualPageModel.objects(volume=self.vol)

    def post(self, request, *args, **kwargs):
        # is_ajax is a method; without the call the check is always true
        if request.is_ajax():
            vp_dict_list = json.loads(request.POST.get('data', []))
            for vp_dict in vp_dict_list:
                o_id = vp_dict.pop('_id')
                original_doc = VirtualPageModel.objects.get(id=o_id)
                try:
                    original_doc.save(**vp_dict)
                except Exception:
                    print(traceback.format_exc())
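On the deletion question raised above: a Django class-based view is instantiated afresh for every request, so nothing stored on the view during the GET is still there when the POST arrives. One possible approach, sketched here as an assumption rather than the established way, is to re-query the current ids on the server during the POST and diff them against the ids in the submitted payload. That way the client never has to send an explicit delete list. plan_changes and the sample ids are made up for illustration, and ids are plain strings, as in the JSON the view already sends.

```python
def plan_changes(existing_ids, submitted_docs):
    """Split a submitted batch into documents to create, update and delete."""
    existing_ids = set(existing_ids)
    # No _id yet: the document was added on the client
    to_create = [doc for doc in submitted_docs if "_id" not in doc]
    # Known _id: the document still exists and may have changed
    to_update = [doc for doc in submitted_docs if doc.get("_id") in existing_ids]
    # Present on the server but absent from the payload: removed on the client
    submitted_ids = {doc["_id"] for doc in submitted_docs if "_id" in doc}
    to_delete = existing_ids - submitted_ids
    return to_create, to_update, to_delete


existing = ["a1", "b2", "c3"]  # ids re-queried on the server for this volume
submitted = [
    {"_id": "a1", "volume": "vol-1"},
    {"_id": "c3", "volume": "vol-1"},
    {"volume": "vol-1"},  # new document, no _id
]
to_create, to_update, to_delete = plan_changes(existing, submitted)
```

Because the GET only sends the slice records[start:end], the ids used for the comparison have to come from that same slice; otherwise every document outside the page would look deleted.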
Hi, I am implementing test cases for my models.
I am using MongoEngine 0.9.0 + Django 1.8.
My models.py
class Project(Document):
    # commented waiting for org-group to get finalize
    project_name = StringField()
    org_group = ListField(ReferenceField(OrganizationGroup, required=False))
My Serializers.py
class ProjectSerializer(DocumentSerializer):
    class Meta:
        model = Project
        depth = 1
test.py file
def setUp(self):
    # Every test needs access to the request factory.
    self.factory = RequestFactory()
    self.user = User.objects.create_user(
        username='jacob', email='jacob#jacob.com', password='top_secret')

def test_post_put_project(self):
    """
    Ensure we can create new clients in mongo database.
    """
    org_group = str((test_utility.create_organization_group(self)).id)
    url = '/project-management/project/'
    data = {
        "project_name": "googer",
        "org_group": [org_group],
    }
    ##import pdb; pdb.set_trace()
    factory = APIRequestFactory()
    user = User.objects.get(username='jacob')
    view = views.ProjectList.as_view()
    # Make an authenticated request to the view...
    request = factory.post(url, data=data,)
    force_authenticate(request, user=user)
    response = view(request)
    self.assertEqual(response.status_code, 200)
When I am running test cases I am getting this error
(Only lists and tuples may be used in a list field: ['org_group'])
The complete Stack Trace is
ValidationError: Got a ValidationError when calling Project.objects.create().
This may be because request data satisfies serializer validations but not Mongoengine`s.
You may need to check consistency between Project and ProjectSerializer.
If that is not the case, please open a ticket regarding this issue on https://github.com/umutbozkurt/django-rest-framework-mongoengine/issues
Original exception was: ValidationError (Project:None) (Only lists and tuples may be used in a list field: ['org_group'])
I don't understand why we can't pass the object like this.
When I post the same data as a real request to the same view, it works for me, but in the test cases it fails.
The tests are most likely sending the request as multipart/form-data, which means lists and nested data are not supported.
You can override this with the format argument, which I'm guessing you probably want to set to json. Most likely your front-end is using JSON, or a parser which supports lists, which explains why you are not seeing this outside the tests.
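In the test from the question, that would mean calling factory.post(url, data, format='json'). To illustrate why the encoding matters, here is a small standard-library sketch. It uses URL-encoded form data as a stand-in for multipart/form-data (an assumption made to keep the example short; both are flat key/value formats) and contrasts it with JSON. The id value abc123 is made up.

```python
import json
from urllib.parse import parse_qs, urlencode

data = {"project_name": "googer", "org_group": ["abc123"]}

# Form-style encoding: flat key=value pairs, a list becomes repeated keys
form_body = urlencode(data, doseq=True)
form_decoded = parse_qs(form_body)

# JSON keeps the distinction between a string and a list of strings
json_decoded = json.loads(json.dumps(data))
```

After the form round trip, project_name and org_group look the same (each is a list of strings), so the receiving side can no longer tell which field was meant to be a list. The JSON round trip returns the original structure unchanged.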
I am trying to create a REST API with Neo4j and Django in the backend.
The problem is that even though I have Django models using Neo4Django, I can't use frameworks like Tastypie or Piston that normally serialize models into JSON (or XML).
Sorry if my question is confusing or unclear; I am a newbie to web services.
Thanks for your help
EDIT: So I started with Tastypie and followed the tutorial on this page http://django-tastypie.readthedocs.org/en/latest/tutorial.html. I am looking for displaying the Neo4j JSON response in the browser, but when I try to access to http://127.0.0.1:8000/api/node/?format=json I get this error instead:
{"error_message": "'NoneType' object is not callable", "traceback": "Traceback (most recent call last):\n\n File \"/usr/local/lib/python2.6/dist-packages/tastypie/resources.py\", line 217, in wrapper\n response = callback(request, *args, **kwargs)\n\n File \"/usr/local/lib/python2.6/dist-packages/tastypie/resources.py\", line 459, in dispatch_list\n return self.dispatch('list', request, **kwargs)\n\n File \"/usr/local/lib/python2.6/dist-packages/tastypie/resources.py\", line 491, in dispatch\n response = method(request, **kwargs)\n\n File \"/usr/local/lib/python2.6/dist-packages/tastypie/resources.py\", line 1298, in get_list\n base_bundle = self.build_bundle(request=request)\n\n File \"/usr/local/lib/python2.6/dist-packages/tastypie/resources.py\", line 718, in build_bundle\n obj = self._meta.object_class()\n\nTypeError: 'NoneType' object is not callable\n"}
Here is my code :
api.py file:
class NodeResource (ModelResource): #it doesn't work with Resource neither
    class meta:
        queryset= Node.objects.all()
        resource_name = 'node'
urls.py file:
node_resource = NodeResource()
urlpatterns = patterns('',
    url(r'^api/', include(node_resource.urls)),
)
models.py file :
class Node(models.NodeModel):
    p1 = models.StringProperty()
    p2 = models.StringProperty()
I would advise steering away from passing Neo4j REST API responses directly through your application. Not only would you not be in control of the structure of these data formats as they evolve and deprecate (which they do) but you would be exposing unnecessary internals of your database layer.
Besides Neo4Django, you have a couple of other options you might want to consider. Neomodel is another model layer designed for Django and intended to act like the built-in ORM; you also have the option of the raw OGM layer provided by py2neo which may help but isn't Django-specific.
It's worth remembering that Django and its plug-ins have been designed around a traditional RDBMS, not a graph database, so none of these solutions will be perfect. Whatever you choose, you're likely to have to carry out a fair amount of transformation work to create your application's API.
Django-Tastypie also allows you to create REST APIs on top of NoSQL databases, as described in http://django-tastypie.readthedocs.org/en/latest/non_orm_data_sources.html.
The principle is to use tastypie.resources.Resource rather than tastypie.resources.ModelResource, which is SPECIFIC to RDBMS; the main methods must then be redefined in order to return JSON with the desired parameters.
So I took the example given in the link, modified it and used Neo4j REST Client for Python to get an instance of the db and perform requests, and it worked like a charm.
Thanks for all your responses :)
Thanks to recent contributions, Neo4django now supports Tastypie out of the box! I'd love to know what you think if you try it out.
EDIT:
I've just run through the tastypie tutorial, and posted a gist with the resulting example. I noticed nested resources are a little funny, but otherwise it works great. I'm pretty sure the gents who contributed the patches enabling this support also know how to take care of nested resources- I'll ask them to speak up.
EDIT:
As long as relationships are specified in the ModelResource, they work great. If anyone would like to see examples, let me know.
Well, my answer was a bit vague, so I'm going to post how I solved the problem, with some code:
Assume that I want to create an airport resource with some attributes. I will structure this in 3 different files (for readability reasons).
First : airport.py
This file will contain all the resource attributes and a constructor too :
from models import *

class Airport(object):
    def __init__(self, iata, icao, name, asciiName, geonamesId, wikipedia, id, latitude, longitude):
        self.icao = icao
        self.iata = iata
        self.name = name
        self.geonamesId = geonamesId
        self.wikipedia = wikipedia
        self.id = id
        self.latitude = latitude
        self.longitude = longitude
        self.asciiName = asciiName
This file will be used in order to create resources.
Then the second file : AirportResource.py:
This file will contain the resource attributes and some basic methods depending on which request we want our resource to handle.
class AirportResource(Resource):
    iata = fields.CharField(attribute='iata')
    icao = fields.CharField(attribute='icao')
    name = fields.CharField(attribute='name')
    asciiName = fields.CharField(attribute='asciiName')
    latitude = fields.FloatField(attribute='latitude')
    longitude = fields.FloatField(attribute='longitude')
    wikipedia = fields.CharField(attribute='wikipedia')
    geonamesId = fields.IntegerField(attribute='geonamesId')

    class Meta:
        resource_name = 'airport'
        object_class = Airport
        allowed_methods = ['get', 'put']
        collection_name = 'airports'
        detail_uri_name = 'id'

    def detail_uri_kwargs(self, bundle_or_obj):
        # Tastypie may pass either a Bundle or the bare object
        kwargs = {}
        if isinstance(bundle_or_obj, Bundle):
            kwargs['id'] = bundle_or_obj.obj.id
        else:
            kwargs['id'] = bundle_or_obj.id
        return kwargs
As mentioned in the docs, if we want to create an API that handles GET, POST, PUT and DELETE requests, we must override/implement the following methods:
def obj_get_list(self, bundle, **kwargs): to GET a list of objects
def obj_get(self, bundle, **kwargs): to GET an individual object
def obj_create(self, bundle, **kwargs): to create an object (POST method)
def obj_update(self, bundle, **kwargs): to update an object (PUT method)
def obj_delete(self, bundle, **kwargs): to delete an object (DELETE method)
(see http://django-tastypie.readthedocs.org/en/latest/non_orm_data_sources.html)
Normally, in ModelResource all those methods are defined and implemented, so they can be used directly without any difficulty. But in this case, they should be customized according to what we want to do.
Let's see an example of implementing obj_get_list and obj_get :
For obj_get_list:
In ModelResource, the data is FIRSTLY fetched from the database, then it could be FILTERED according to the filter declared in META class ( see http://django-tastypie.readthedocs.org/en/latest/interacting.html). But I didn't wish to implement such behavior (get everything then filter), so I made a query to Neo4j given the query string parameters:
def obj_get_list(self, bundle, **kwargs):
    data = []
    params = []
    for key in bundle.request.GET.iterkeys():
        params.append(key)
    if "search" in params:
        query = bundle.request.GET['search']
        try:
            results = manager.searchAirport(query)
            data = createAirportResources(results)
        except Exception as e:
            raise NotFound(e)
    else:
        raise BadRequest("Non valid URL")
    return data
and for obj_get:
def obj_get(self, bundle, **kwargs):
    id = kwargs['id']
    try:
        airportNode = manager.getAirportNode(id)
        airport = createAirportResources([airportNode])
        return airport[0]
    except Exception as e:
        raise NotFound(e)
and finally a generic function that takes as parameter a list of nodes and returns a list of Airport objects:
def createAirportResources(nodes):
    data = []
    for node in nodes:
        iata = node.properties['iata']
        icao = node.properties['icao']
        name = node.properties['name']
        asciiName = node.properties['asciiName']
        geonamesId = node.properties['geonamesId']
        wikipedia = node.properties['wikipedia']
        id = node.id
        latitude = node.properties['latitude']
        longitude = node.properties['longitude']
        airport = Airport(iata, icao, name, asciiName, geonamesId, wikipedia, id, latitude, longitude)
        data.append(airport)
    return data
Now the third manager.py : which is in charge of making queries to the database and returning results :
First of all, I get an instance of the database using neo4j rest client framework :
from neo4jrestclient.client import *
gdb= GraphDatabase("http://localhost:7474/db/data/")
then the function which gets an airport node :
def getAirportNode(id):
    if(getNodeType(id) == type):
        n = gdb.nodes.get(id)
        return n
    else:
        raise Exception("This airport doesn't exist in the database")
and the one to perform search (I am using a server plugin, see Neo4j docs for more details):
def searchAirport(query):
    airports = gdb.extensions.Search.search(query=query.strip(), searchType='airports', max=6)
    if len(airports) == 0:
        raise Exception('No airports match your query')
    else:
        # Return the search results (the original returned an undefined name)
        return airports
Hope this will help :)