Like many others, I've been learning web development in Django by building a test app. I have the basic models set up, and I've populated a few of the tables with the absolute minimum data needed for further testing, using fixtures.
Now, for a different table, I want to create data tuples through a custom management command which takes the required arguments. If this works as expected, I'll save the created data to the database by adding the --save option.
The syntax of the command is like this:
create_raw_data owner_id temperature [--save]
where owner_id is required and temperature (in °C) is optional. Within the handle method, I'm using factory_boy to create the raw data with the given arguments.
I did have some issues, but after searching on SO, Google, the Django docs, etc., I got the command working fine.
EXCEPT when I input a negative temperature...
Then I get the following error:
Usage: C:\test\manage.py create_raw_data [options]
Creates a RawData object. Usage: create_raw_data owner_id temperature [--save]
C:\test\manage.py: error: no such option: -5
The code I have for parsing the args is like this:
for index, item in enumerate(args):
    if index == 0:
        owner_id = int(item)
    elif index == 1:
        temp = int(item)
I put a print(args) as the first line inside handle, but it seems control is not even reaching there.
I'm not sure what is wrong... please help.
Thanks a lot.
I got the issue fixed, so I'm providing an answer for others who may come across this.
The issue was with the parse_args method of optparse. I've read in a number of places that although optparse is deprecated and argparse is recommended instead, Django recommends using optparse since that is what it uses. Long story short, the linked discussion suggested a few alternatives, and using create_raw_data 1 -- -5 works as expected. So I did get a workaround. Thanks.
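For anyone wondering why the bare -- works: optparse (like most getopt-style parsers) treats -- as the end of options, so everything after it is kept as a positional argument instead of being parsed as a flag. A minimal sketch outside Django, assuming plain optparse:

from optparse import OptionParser

parser = OptionParser()
parser.add_option("--save", action="store_true", default=False)

# parser.parse_args(["1", "-5"]) would fail with "no such option: -5",
# because optparse tries to read -5 as a flag.

# With "--", option parsing stops and -5 survives as a positional argument:
options, args = parser.parse_args(["1", "--", "-5"])
print(args)  # ['1', '-5']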
I am trying to save an array of text containing category types for a hotel system, which looks something like this: ['category-1', 'category-2', 'category-3', 'category-4']. I am using category_type = ArrayField(models.CharField(max_length=200), null=True) in my models.py.
The error I get is:
malformed array literal: ""
LINE 1: ..., '{category-1, category-2}'::varchar(200)[], ''::varcha...
                                                          ^
DETAIL: Array value must start with "{" or dimension information.
The error persists even after converting the Python list from ['category-1', 'category-2', 'category-3', 'category-4'] to {category-1, category-2, category-3, category-4}.
I have gone through the PostgreSQL documentation and found very limited help.
https://pganalyze.com/docs/log-insights/app-errors/U114 describes something similar to the problem I am facing.
Could someone please tell me what I am doing wrong? Any help would be appreciated.
EDIT:
The following is in my views.py:
hotel_category = categoryTable(category_type=categorytype)
hotel_category.save()
and I am using categorytype = request.POST.getlist('category-type') in my views.py to get it from the POST request after the user submits the form. This returns the Python list I mentioned above; I have manipulated this list to match the PostgreSQL ArrayField format with '{' and '}', but I still get this error. If there is anything else you would like me to add, please let me know. :)
This is an update/answer to my question for anyone who faces this issue in the future. After struggling to find help from different resources, I decided to use a JSON string to store my Python list.
I am using:
categorytype = json.dumps(request.POST.getlist('category-type'))
to encode, and JSONDecoder() to fetch from the database and decode. I have no idea how this will impact my further development, but for now it seems a decent approach, since personally I think ArrayFields are not well supported or documented in Django.
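For concreteness, a minimal sketch of that round trip, assuming category_type has been switched to a plain text column (e.g. models.TextField), since a JSON string can't live in an ArrayField; the names are the ones from the question:

import json

# In the view: encode the POSTed list to a JSON string before saving
categorytype = json.dumps(request.POST.getlist('category-type'))
hotel_category = categoryTable(category_type=categorytype)
hotel_category.save()

# Later: fetch and decode the string back into a Python list
# (json.loads is the shortcut for json.JSONDecoder().decode)
stored = categoryTable.objects.get(pk=hotel_category.pk)
categories = json.loads(stored.category_type)  # ['category-1', 'category-2', ...]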
I will keep this post updated as I progress further on how this approach has impacted my development.
Have a nice day.
My Django app uses update_or_create() to update a bunch of records. In some cases, the updates are really few within a ton of records, and it would be nice to know what got updated within those records. Is it possible to know what got updated (i.e. which fields' values changed)? If not, does anyone have ideas for workarounds to achieve that?
This will be invoked from the shell, so ideally it would be nice to be prompted for confirmation just before a value is changed within update_or_create(); but failing that, knowing what got changed would also help.
Update (more context): I thought I'd give more context here. The data in this Django app gets updated through various means (users coming to the web site, the admin page, scripts run from the shell that populate data from a CSV, etc.). The question above is important mostly for the shell scripts that update data from CSVs, so a solution at the database/trigger/signal level probably won't help here.
This is what I ended up doing:
for row in reader:
    school_obj0, created = Org.objects.get_or_create(school_id=row[0])
    if school_obj0.name != row[1]:
        print(school_obj0.name, '==>', row[1])
        confirmation = input('proceed? [y/n]: ')
        if confirmation == 'y':
            school_obj1, created = Org.objects.update_or_create(
                school_id=row[0], defaults={"name": row[1]})
Happy to know about improvements to this approach (please see the update in the question with more context)
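One possible refinement (a sketch, assuming the CSV columns map cleanly onto model field names; the single-entry mapping below is illustrative): build the incoming values as a dict, so the diff and the confirmation prompt aren't hard-coded to one column.

for row in reader:
    incoming = {"name": row[1]}  # extend with other column -> field mappings
    obj, created = Org.objects.get_or_create(school_id=row[0])
    # Keep only the fields whose stored value differs from the CSV value
    changes = {f: v for f, v in incoming.items() if getattr(obj, f) != v}
    if changes:
        print(changes)
        if input('proceed? [y/n]: ') == 'y':
            Org.objects.update_or_create(school_id=row[0], defaults=changes)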
This will be invoked from the shell, so ideally it would be nice to be
prompted for confirmation just before a value is being changed
Unfortunately, databases don't work like that. It's the responsibility of applications to provide this functionality, and Django isn't an application. You can, however, use Django to write an application that provides it.
As for finding out whether an object was updated or created, that's what the return value gives you: a tuple whose second element is True if a new object was created and False if an existing one was updated.
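A minimal sketch of checking that flag, reusing the Org model from the question (the field values are illustrative):

obj, created = Org.objects.update_or_create(
    school_id=42, defaults={"name": "New Name"})
if created:
    print('created a brand new record')
else:
    print('updated (or left unchanged) an existing record')

Note that the flag only distinguishes created from fetched; it doesn't tell you whether any field value actually changed, which is why the compare-before-update loop in the question is still needed.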
Can anyone point me in the right direction as to where I should place a script solely for loading data into ndb? I wish to upload all the data into the GAE ndb so that the application can run queries on it.
Right now, the loading of data is in my application. I wish to place it separately from the main application.
Should it be wired up in the yaml file?
EDITED
This is a snippet of the entity and the handler that uploads the data into the GAE ndb.
I wish to place this chunk of code separately from my main application .py, because this data won't be uploaded frequently, and I'd like to keep the code in the main application "cleaner".
class TagTrend_refine(ndb.Model):
    tag = ndb.StringProperty()
    trendData = ndb.BlobProperty(compressed=True)

class MigrateData(webapp2.RequestHandler):
    def get(self):
        listOfEntities = []
        f = open("tagTrend_refine.txt")
        lines = f.readlines()
        f.close()
        for line in lines:
            temp = line.strip().split("\t")
            data = TagTrend_refine(
                tag=temp[0],
                trendData=temp[1]
            )
            listOfEntities.append(data)
        ndb.put_multi(listOfEntities)
For example, if I placed the above code in a file called dataLoader.py, where should I invoke this script from?
In app.yaml, alongside my main application (knowledgeGraph.application)?
- url: /.*
script: knowledgeGraph.application
You don't show us the application object (no doubt a WSGI app) in your knowledgeGraph.py module, so I can't know what URL you want to serve with the MigrateData handler -- I'll just guess it's something like /migratedata.
So the class TagTrend_refine should be in a separate file (usually called models.py) so that both your dataloader.py and your knowledgeGraph.py can import models to access it (and models.py will need its own import of ndb, of course). (Then of course access to the entity class will be as models.TagTrend_refine -- very basic Python.)
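A minimal sketch of that shared module, using the names assumed in this answer:

# models.py -- imported by both dataloader.py and knowledgeGraph.py
from google.appengine.ext import ndb

class TagTrend_refine(ndb.Model):
    tag = ndb.StringProperty()
    trendData = ndb.BlobProperty(compressed=True)

Both other modules then just import models and refer to models.TagTrend_refine.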
Next, you'll complete dataloader.py by defining a WSGI app, e.g., at the end of the file:
app = webapp2.WSGIApplication(routes=[('/migratedata', MigrateData)])
(of course this means this module will need to import webapp2 as well -- can I take for granted a knowledge of super-elementary Python?).
In app.yaml, as the first URL, before that /.*, you'll have:
- url: /migratedata
  script: dataloader.app
Given all this, when you visit '/migratedata', your handler will read the "tagTrend_refine.txt" file that you uploaded together with your .py, .yaml, and so on, files in your overall GAE application, and unconditionally create one entity per line of that file (assuming you fix the multiple indentation problems in your code as displayed above, but, again, this is just super-elementary Python -- presumably you've used both tabs and spaces and they show up OK in your editor, but not here on SO... I recommend you use strictly, only spaces, never tabs, in Python code).
However this does seem to be a peculiar task. If /migratedata gets visited twice, it will create duplicates of all entities. If you change the tagTrend_refine.txt and deploy a changed variation, then visit /migratedata... all old entities will stick around and all the new entities will join them. And so forth.
Moreover -- /migratedata is NOT idempotent (if visited more than once it does not produce the same state as running it just once) so it shouldn't be a GET (and now we're on to super-elementary HTTP for a change!-) -- it should be a POST.
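Concretely, the fix is as small as renaming the method -- a sketch of the handler from the question reworked as a POST (imports shown for completeness):

# dataloader.py
import webapp2
from google.appengine.ext import ndb
import models  # the shared models.py suggested above

class MigrateData(webapp2.RequestHandler):
    def post(self):  # POST, not GET: this handler mutates the datastore
        entities = []
        with open("tagTrend_refine.txt") as f:
            for line in f:
                tag, trend_data = line.strip().split("\t")
                entities.append(
                    models.TagTrend_refine(tag=tag, trendData=trend_data))
        ndb.put_multi(entities)

app = webapp2.WSGIApplication(routes=[('/migratedata', MigrateData)])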
In fact I suspect (but I'm really flying blind here, since you see fit to give such tiny amounts of information) that you in fact want to upload a .txt file to a POST handler and do the updates that way (perhaps avoiding duplicates...?). However, I'm no mind reader, so this is about as far as I can go.
I believe I have fully answered the question you posted (though perhaps not the one you meant but didn't express:-) and by SO's etiquette it would be nice to upvote and accept this answer, then, if needed, post another question, expressing MUCH more clearly and completely what you're trying to achieve, your current .py and .yaml (ideally with correct indentation), what they actually do and why you'd like to do something different. For POST vs GET in particular, just study When should I use GET or POST method? What's the difference between them? ...
Alex's solution will work, as long as all your data can be loaded in under 1 minute, as that's the timeout for an App Engine request.
For larger data sets, consider calling the Datastore API directly from your own computer, where you have the source data. It's a bit of a hassle because it's a different API; it's not ndb. But it's still a pretty simple API. Here's some code that calls it:
https://github.com/GoogleCloudPlatform/getting-started-python/blob/master/2-structured-data/bookshelf/model_datastore.py
Again, this code can run anywhere. It doesn't need to be uploaded to app engine to run.
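For illustration, a minimal sketch of that approach using the google-cloud-datastore client library (the project ID is a placeholder; requires pip install google-cloud-datastore plus credentials, and runs on your own machine):

from google.cloud import datastore

client = datastore.Client(project='your-project-id')

entities = []
with open("tagTrend_refine.txt") as f:
    for line in f:
        tag, trend_data = line.strip().split("\t")
        # A plain Entity whose kind matches the ndb model name
        entity = datastore.Entity(key=client.key('TagTrend_refine'))
        entity.update({'tag': tag, 'trendData': trend_data})
        entities.append(entity)

client.put_multi(entities)  # no App Engine deploy needed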
I am working on a project to expand a testing suite that my company uses. One of the things asked of me was to link the website to our GitHub source so that the dev team could continue tracking issues there instead of looking in two places. I was able to do this, but the problem is that every time a bug is reported, an issue is opened.
I want to add a field to my Django model that tracks an Issue object (from the github3.py wrapper) that is sent to GitHub. I want to use this to check whether an Issue has already been created in GitHub by that instance of the BugReport; if it has, edit the Issue instead of creating a duplicate in GitHub. Does Django have something that can handle this sort of reference?
I am using Django 1.3.1 and Python 2.7.1
EDIT
I was able to figure my specific problem out using esauro's suggestions. However, as mkoistinen said: if this problem came up in a program where the workaround was not as easy as it was here, should an object reference be created like I originally asked about, or is that bad practice? And if such an object reference is okay, how would you do it with Django models?
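For reference, one common shape for such a reference is to persist just enough to rebuild the Issue on demand -- a sketch with illustrative field names, not a pattern confirmed by the answers:

from django.db import models

class BugReport(models.Model):
    title = models.CharField(max_length=200)
    # GitHub issue number for this report; None means no issue filed yet
    github_issue_number = models.IntegerField(null=True, blank=True)

When a report recurs, a non-null github_issue_number says to fetch and edit the existing issue (as shown below with github3.py) rather than open a duplicate.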
I'm the creator of github3.py.
If you want to get the issue itself via a number, there are a couple different ways to do this. I'm not sure how you're interacting with the API but you can do:
import github3
i = github3.issue('repo_owner', 'repo_name', issue_number)
or
import github3
r = github3.repository('repo_owner', 'repo_name')
i = r.issue(issue_number)
or
import github3
g = github3.login(client_key='client_key', client_secret='client_secret')
i = g.issue('repo_owner', 'repo_name', issue_number)
# or
r = g.repository('repo_owner', 'repo_name')
i = r.issue(issue_number)
otherwise if you're looking for something that's in an issue without knowing the number:
import github3
r = github3.repository('repo_owner', 'repo_name')
for i in r.iter_issues():
    if 'text to search for' in i.body_text:
        i.edit('...')
I need to call a Python script from C#; I'll be doing it like this:
System.Diagnostics.Process process = new System.Diagnostics.Process();
System.Diagnostics.ProcessStartInfo startInfo = new System.Diagnostics.ProcessStartInfo();
startInfo.WindowStyle = System.Diagnostics.ProcessWindowStyle.Hidden;
startInfo.FileName = "cmd.exe";
startInfo.Arguments = "/C python \"C:\\working\\projects\\python tests\\Sample Data.py\"";
process.StartInfo = startInfo;
process.Start();
but I now need to add command-line arguments. I'm trying to decide between just using sys.argv, which seems very simple to implement, and going with argparse. If I will always pass one and only one parameter (a date), is there any advantage to using argparse?
Additional info regarding the problem (slightly tangential to the question):
The date I'm passing is a parameter for a SQL query. I could instead run this query in C# (which I would prefer), but then I would need to pass the result to Python via command-line arguments, which seems to me a terrible idea -- but maybe this is something argparse can handle? The table has two date columns and 4 float columns. Can such a table be passed this way?
The reason I am calling Python via cmd.exe and not using IronPython is that (a) I only need to pass information once from C# to Python, so the communication between the two is very limited, and (b) the result is a 3D surface plot generated by mplot3d, which seems like a huge hassle to make work in IronPython (which generally confuses me anyway). So if I am just passing the single date, this doesn't seem unreasonable. But if I could pass that entire table easily, either by a command-line argument or some other not overly complicated method, I would be very interested in hearing how.
Since you're a nice clean slate of knowledgeless bliss, learn argparse. It's piss-easy and replaces hand-rolled sys.argv parsing, which now reads as old and archaic; you'll still find sys.argv far more common, for now, because it's been around since Guido was a baby.
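For the single-date case, a minimal sketch of what that looks like (the script name and date format are assumptions):

# sample_data.py -- parse one required date argument
import argparse
from datetime import datetime

parser = argparse.ArgumentParser(description='Plot sample data for a date')
parser.add_argument(
    'date',
    type=lambda s: datetime.strptime(s, '%Y-%m-%d').date(),
    help='query date in YYYY-MM-DD form')
args = parser.parse_args()
print(args.date)  # a datetime.date, already validated

You get usage/help output and a readable error for a malformed date for free -- exactly the plumbing you'd otherwise hand-roll around sys.argv[1].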