Related
I'm writing API tests using Postman. I'm organizing them into folders by endpoint, and subfolders by test cases within the endpoint folders. There are multiple cases for each endpoint and for each case there are post calls that set up data prior the the csubject-endpoint call that I'm making assertions against.
I already have 100s of calls in this suite. The test runner, unfortunately, does not provide the folder names in its output, so it's difficult to see at a glance which particular case I am looking at when, for example, it reports a test fail.
Is there a convenient way to obtain the folder names for a given call in its test script? With this, I could prepend the case name to the test name, and that would make my tests vastly more readable.
I think a variable containing the current request path in the folder hierarchy would be the best, but for now I didn't find such a feature.
Instead I may suggest such a workaround solution:
In each folder prerequest scripts you set a variable:
pm.environment.set("folder", "folder1/folder1.1/")
the value of folder variable you have to maintain separately for each folder.
Then on a collection level you put a collection test like this:
pm.test("location: " + pm.environment.get("folder"), true)
After running your collection in the Runner you will get the output
from collection tests at the beginning of the test results for each request
showing the folder location.
The effort of setting folder variables pays off when you estimate the results of complex tests. I used to change the names of the requests but it is even more complicated.
UPDATE:
You can also find the info in results hovering over a gray shortcut of path on the left of the request status. A tooltip displays a full path, what in fact eliminates the need of the above solution if you only want to observe the results, but the solution can be useful if you want to make some logs containing the path.
I don't think that there is anything like that from within the application - The closet I can see is the pm.info.requestName function which references the request name that the test belongs too.
This is a basic use case but you could add this to the test name to give you a 'quick glance' and what was run against what request.
pm.test(`${pm.info.requestName} - Status code is 200`, () => {
pm.response.to.have.status(200)
})
If you take a look at Newman it might have something within the summary object that you could extract, in a script, to get the folder name but I've never done this.
Closest thing right now would be this:
I believe I included this in my logged bugs / feature requests out to their team.
pytest allows the creation of fixtures that are automatically applied to every test in a test suite (via the autouse keyword argument). This is useful for implementing setup and teardown actions that affect every test case. More details can be found in the pytest documentation.
In theory, the same infrastructure would also be very useful for verifying post-conditions that are expected to exist after each test runs. For example, maybe a log file is created every time a test runs, and I want to make sure it exists when the test ends.
Don't get hung up on the details, but I hope you get the basic idea. The point is that it would be tedious and repetitive to add this code to each test function, especially when autouse fixtures already provide infrastructure for applying this action to every test. Furthermore, fixtures can be packaged into plugins, so my check could be used by other packages.
The problem is that it doesn't seem to be possible to cause a test failure from a fixture. Consider the following example:
#pytest.fixture(autouse=True)
def check_log_file():
# Yielding here runs the test itself
yield
# Now check whether the log file exists (as expected)
if not log_file_exists():
pytest.fail("Log file could not be found")
In the case where the log file does not exist, I don't get a test failure. Instead, I get a pytest error. If there are 10 tests in my test suite, and all of them pass, but 5 of them are missing a log file, I will get 10 passes and 5 errors. My goal is to get 5 passes and 5 failures.
So the first question is: is this possible? Am I just missing something? This answer suggests to me that it is probably not possible. If that's the case, the second question is: is there another way? If the answer to that question is also "no": why not? Is it a fundamental limitation of pytest infrastructure? If not, then are there any plans to support this kind of functionality?
In pytest, a yield-ing fixture has the first half of its definition executed during setup and the latter half executed during teardown. Further, setup and teardown aren't considered part of any individual test and thus don't contribute to its failure. This is why you see your exception reported as an additional error rather than a test failure.
On a philosophical note, as (cleverly) convenient as your attempted approach might be, I would argue that it violates the spirit of test setup and teardown and thus even if you could do it, you shouldn't. The setup and teardown stages exist to support the execution of the test—not to supplement its assertions of system behavior. If the behavior is important enough to assert, the assertions are important enough to reside in the body of one or more dedicated tests.
If you're simply trying to minimize the duplication of code, I'd recommend encapsulating the assertions in a helper method, e.g., assert_log_file_cleaned_up(), which can be called from the body of the appropriate tests. This will allow the test bodies to retain their descriptive power as specifications of system behavior.
AFAIK it isn't possible to tell pytest to treat errors in particular fixture as test failures.
I also have a case where I would like to use fixture to minimize test code duplication but in your case pytest-dependency may be a way to go.
Moreover, test dependencies aren't bad for non-unit tests and be careful with autouse because it makes tests harder to read and debug. Explicit fixtures in test function header give you at least some directions to find executed code.
I prefer using context managers for this purpose:
from contextlib import contextmanager
#contextmanager
def directory_that_must_be_clean_after_use():
directory = set()
yield directory
assert not directory
def test_foo():
with directory_that_must_be_clean_after_use() as directory:
directory.add("file")
If you absoulutely can't afford to add this one line for every test, it's easy enough to write this as a plugin.
Put this in your conftest.py:
import pytest
directory = set()
# register the marker so that pytest doesn't warn you about unknown markers
def pytest_configure(config):
config.addinivalue_line("markers",
"directory_must_be_clean_after_test: the name says it all")
# this is going to be run on every test
#pytest.hookimpl(hookwrapper=True)
def pytest_runtest_call(item):
directory.clear()
yield
if item.get_closest_marker("directory_must_be_clean_after_test"):
assert not directory
And add the according marker to your tests:
# test.py
import pytest
from conftest import directory
def test_foo():
directory.add("foo file")
#pytest.mark.directory_must_be_clean_after_test
def test_bar():
directory.add("bar file")
Running this will give you:
fail.py::test_foo PASSED
fail.py::test_bar FAILED
...
> assert not directory
E AssertionError: assert not {'bar file'}
conftest.py:13: AssertionError
You don't have to use markers, of course, but these allow controlling the scope of the plugin. You can have the markers per-class or per-module as well.
Can anyone guide me towards the right direction as to where I should place a script solely for loading data into ndb. As I wish to upload all data into the gae ndb so that the application could perform query on it.
Right now, the loading of data is in my application. I wish to placed it separately from the main application.
Should it be edited in the yaml file?
EDITED
This is a snippet of the entity and the handler to upload the data into GAE ndb.
I wish to placed this chunk of code separately from my main application .py. Reason being the uploading of this data won't be done frequently and to keep the codes in the main application "cleaner".
class TagTrend_refine(ndb.Model):
tag = ndb.StringProperty()
trendData = ndb.BlobProperty(compressed=True)
class MigrateData(webapp2.RequestHandler):
def get(self):
listOfEntities = []
f = open("tagTrend_refine.txt")
lines = f.readlines()
f.close()
for line in lines:
temp = line.strip().split("\t")
data = TagTrend_refine(
tag = temp[0],
trendData = temp[1]
)
listOfEntities.append(data)
ndb.put_multi(listOfEntities)
For example if I placed the above code in a file called dataLoader.py, where should I call this script to invoke?
In app.yaml alongside my main application(knowledgeGraph.application)?
- url: /.*
script: knowledgeGraph.application
You don't show us the application object (no doubt a WSGI app) in your knowledge.py module, so I can't know what URL you want to serve with the MigrateData handler -- I'll just guess it's something like /migratedata.
So the class TagTrend_refine should be in a separate file (usually called models.py) so that both your dataloader.py, and your knowledge.py, can import models to access it (and models.py will need its own import of ndb of course). (Then of course access to the entity class will be as models.TagTrend_refine -- very basic Python).
Next, you'll complete dataloader.py by defining a WSGI app, e.g, at end of file,
app = webapp2.WSGIApplication(routes=[('/migratedata', MigrateData)])
(of course this means this module will need to import webapp2 as well -- can I take for granted a knowledge of super-elementary Python?).
In app.yaml, as the first URL, before that /.*, you'll have:
url: /migratedata
script: dataloader.app
Given all this, when you visit '/migratedata', your handler will read the "tagTrend_refine.txt" file that you uploaded together with your .py, .yaml, and so on, files in your overall GAE application, and unconditionally create one entity per line of that file (assuming you fix the multiple indentation problems in your code as displayed above, but, again, this is just super-elementary Python -- presumably you've used both tabs and spaces and they show up OK in your editor, but not here on SO... I recommend you use strictly, only spaces, never tabs, in Python code).
However this does seem to be a peculiar task. If /migratedata gets visited twice, it will create duplicates of all entities. If you change the tagTrend_refine.txt and deploy a changed variation, then visit /migratedata... all old entities will stick around and all the new entities will join them. And so forth.
Moreover -- /migratedata is NOT idempotent (if visited more than once it does not produce the same state as running it just once) so it shouldn't be a GET (and now we're on to super-elementary HTTP for a change!-) -- it should be a POST.
In fact I suspect (but I'm really flying blind here, since you see fit to give such tiny amounts of information) that you in fact want to upload a .txt file to a POST handler and do the updates that way (perhaps avoiding duplicates...?). However, I'm no mind reader, so this is about as far as I can go.
I believe I have fully answered the question you posted (though perhaps not the one you meant but didn't express:-) and by SO's etiquette it would be nice to upvote and accept this answer, then, if needed, post another question, expressing MUCH more clearly and completely what you're trying to achieve, your current .py and .yaml (ideally with correct indentation), what they actually do and why you'd like to do something different. For POST vs GET in particular, just study When should I use GET or POST method? What's the difference between them? ...
Alex's solution will work, as long as all you data can be loaded in under 1 minute, as that's the timeout for an app engine request.
For larger data, consider calling the datastore API directly from your own computer where you have the source. It's a bit of a hassle because it's a different API; it's not ndb. But it's still a pretty simple API. Here's some code that calls the API:
https://github.com/GoogleCloudPlatform/getting-started-python/blob/master/2-structured-data/bookshelf/model_datastore.py
Again, this code can run anywhere. It doesn't need to be uploaded to app engine to run.
I want to programmatically edit python source code. Basically I want to read a .py file, generate the AST, and then write back the modified python source code (i.e. another .py file).
There are ways to parse/compile python source code using standard python modules, such as ast or compiler. However, I don't think any of them support ways to modify the source code (e.g. delete this function declaration) and then write back the modifying python source code.
UPDATE: The reason I want to do this is I'd like to write a Mutation testing library for python, mostly by deleting statements / expressions, rerunning tests and seeing what breaks.
Pythoscope does this to the test cases it automatically generates as does the 2to3 tool for python 2.6 (it converts python 2.x source into python 3.x source).
Both these tools uses the lib2to3 library which is an implementation of the python parser/compiler machinery that can preserve comments in source when it's round tripped from source -> AST -> source.
The rope project may meet your needs if you want to do more refactoring like transforms.
The ast module is your other option, and there's an older example of how to "unparse" syntax trees back into code (using the parser module). But the ast module is more useful when doing an AST transform on code that is then transformed into a code object.
The redbaron project also may be a good fit (ht Xavier Combelle)
The builtin ast module doesn't seem to have a method to convert back to source. However, the codegen module here provides a pretty printer for the ast that would enable you do do so.
eg.
import ast
import codegen
expr="""
def foo():
print("hello world")
"""
p=ast.parse(expr)
p.body[0].body = [ ast.parse("return 42").body[0] ] # Replace function body with "return 42"
print(codegen.to_source(p))
This will print:
def foo():
return 42
Note that you may lose the exact formatting and comments, as these are not preserved.
However, you may not need to. If all you require is to execute the replaced AST, you can do so simply by calling compile() on the ast, and execing the resulting code object.
Took a while, but Python 3.9 has this:
https://docs.python.org/3.9/whatsnew/3.9.html#ast
https://docs.python.org/3.9/library/ast.html#ast.unparse
ast.unparse(ast_obj)
Unparse an ast.AST object and generate a string with code that would produce an equivalent ast.AST object if parsed back with ast.parse().
In a different answer I suggested using the astor package, but I have since found a more up-to-date AST un-parsing package called astunparse:
>>> import ast
>>> import astunparse
>>> print(astunparse.unparse(ast.parse('def foo(x): return 2 * x')))
def foo(x):
return (2 * x)
I have tested this on Python 3.5.
You might not need to re-generate source code. That's a bit dangerous for me to say, of course, since you have not actually explained why you think you need to generate a .py file full of code; but:
If you want to generate a .py file that people will actually use, maybe so that they can fill out a form and get a useful .py file to insert into their project, then you don't want to change it into an AST and back because you'll lose all formatting (think of the blank lines that make Python so readable by grouping related sets of lines together) (ast nodes have lineno and col_offset attributes) comments. Instead, you'll probably want to use a templating engine (the Django template language, for example, is designed to make templating even text files easy) to customize the .py file, or else use Rick Copeland's MetaPython extension.
If you are trying to make a change during compilation of a module, note that you don't have to go all the way back to text; you can just compile the AST directly instead of turning it back into a .py file.
But in almost any and every case, you are probably trying to do something dynamic that a language like Python actually makes very easy, without writing new .py files! If you expand your question to let us know what you actually want to accomplish, new .py files will probably not be involved in the answer at all; I have seen hundreds of Python projects doing hundreds of real-world things, and not a single one of them needed to ever writer a .py file. So, I must admit, I'm a bit of a skeptic that you've found the first good use-case. :-)
Update: now that you've explained what you're trying to do, I'd be tempted to just operate on the AST anyway. You will want to mutate by removing, not lines of a file (which could result in half-statements that simply die with a SyntaxError), but whole statements — and what better place to do that than in the AST?
Parsing and modifying the code structure is certainly possible with the help of ast module and I will show it in an example in a moment. However, writing back the modified source code is not possible with ast module alone. There are other modules available for this job such as one here.
NOTE: Example below can be treated as an introductory tutorial on the usage of ast module but a more comprehensive guide on using ast module is available here at Green Tree snakes tutorial and official documentation on ast module.
Introduction to ast:
>>> import ast
>>> tree = ast.parse("print 'Hello Python!!'")
>>> exec(compile(tree, filename="<ast>", mode="exec"))
Hello Python!!
You can parse the python code (represented in string) by simply calling the API ast.parse(). This returns the handle to Abstract Syntax Tree (AST) structure. Interestingly you can compile back this structure and execute it as shown above.
Another very useful API is ast.dump() which dumps the whole AST in a string form. It can be used to inspect the tree structure and is very helpful in debugging. For example,
On Python 2.7:
>>> import ast
>>> tree = ast.parse("print 'Hello Python!!'")
>>> ast.dump(tree)
"Module(body=[Print(dest=None, values=[Str(s='Hello Python!!')], nl=True)])"
On Python 3.5:
>>> import ast
>>> tree = ast.parse("print ('Hello Python!!')")
>>> ast.dump(tree)
"Module(body=[Expr(value=Call(func=Name(id='print', ctx=Load()), args=[Str(s='Hello Python!!')], keywords=[]))])"
Notice the difference in syntax for print statement in Python 2.7 vs. Python 3.5 and the difference in type of AST node in respective trees.
How to modify code using ast:
Now, let's a have a look at an example of modification of python code by ast module. The main tool for modifying AST structure is ast.NodeTransformer class. Whenever one needs to modify the AST, he/she needs to subclass from it and write Node Transformation(s) accordingly.
For our example, let's try to write a simple utility which transforms the Python 2 , print statements to Python 3 function calls.
Print statement to Fun call converter utility: print2to3.py:
#!/usr/bin/env python
'''
This utility converts the python (2.7) statements to Python 3 alike function calls before running the code.
USAGE:
python print2to3.py <filename>
'''
import ast
import sys
class P2to3(ast.NodeTransformer):
def visit_Print(self, node):
new_node = ast.Expr(value=ast.Call(func=ast.Name(id='print', ctx=ast.Load()),
args=node.values,
keywords=[], starargs=None, kwargs=None))
ast.copy_location(new_node, node)
return new_node
def main(filename=None):
if not filename:
return
with open(filename, 'r') as fp:
data = fp.readlines()
data = ''.join(data)
tree = ast.parse(data)
print "Converting python 2 print statements to Python 3 function calls"
print "-" * 35
P2to3().visit(tree)
ast.fix_missing_locations(tree)
# print ast.dump(tree)
exec(compile(tree, filename="p23", mode="exec"))
if __name__ == '__main__':
if len(sys.argv) <=1:
print ("\nUSAGE:\n\t print2to3.py <filename>")
sys.exit(1)
else:
main(sys.argv[1])
This utility can be tried on small example file, such as one below, and it should work fine.
Test Input file : py2.py
class A(object):
def __init__(self):
pass
def good():
print "I am good"
main = good
if __name__ == '__main__':
print "I am in main"
main()
Please note that above transformation is only for ast tutorial purpose and in real case scenario one will have to look at all different scenarios such as print " x is %s" % ("Hello Python").
If you are looking at this in 2019, then you can use this libcst
package. It has syntax similar to ast. This works like a charm, and preserve the code structure. It's basically helpful for the project where you have to preserve comments, whitespace, newline etc.
If you don't need to care about the preserving comments, whitespace and others, then the combination of ast and astor works well.
I've created recently quite stable (core is really well tested) and extensible piece of code which generates code from ast tree: https://github.com/paluh/code-formatter .
I'm using my project as a base for a small vim plugin (which I'm using every day), so my goal is to generate really nice and readable python code.
P.S.
I've tried to extend codegen but it's architecture is based on ast.NodeVisitor interface, so formatters (visitor_ methods) are just functions. I've found this structure quite limiting and hard to optimize (in case of long and nested expressions it's easier to keep objects tree and cache some partial results - in other way you can hit exponential complexity if you want to search for best layout). BUT codegen as every piece of mitsuhiko's work (which I've read) is very well written and concise.
One of the other answers recommends codegen, which seems to have been superceded by astor. The version of astor on PyPI (version 0.5 as of this writing) seems to be a little outdated as well, so you can install the development version of astor as follows.
pip install git+https://github.com/berkerpeksag/astor.git#egg=astor
Then you can use astor.to_source to convert a Python AST to human-readable Python source code:
>>> import ast
>>> import astor
>>> print(astor.to_source(ast.parse('def foo(x): return 2 * x')))
def foo(x):
return 2 * x
I have tested this on Python 3.5.
Unfortunately none of the answers above actually met both of these conditions
Preserve the syntactical integrity for the surrounding source code (e.g keeping comments, other sorts of formatting for the rest of the code)
Actually use AST (not CST).
I've recently written a small toolkit to do pure AST based refactorings, called refactor. For example if you want to replace all placeholders with 42, you can simply write a rule like this;
class Replace(Rule):
def match(self, node):
assert isinstance(node, ast.Name)
assert node.id == 'placeholder'
replacement = ast.Constant(42)
return ReplacementAction(node, replacement)
And it will find all acceptable nodes, replace them with the new nodes and generate the final form;
--- test_file.py
+++ test_file.py
## -1,11 +1,11 ##
def main():
- print(placeholder * 3 + 2)
- print(2 + placeholder + 3)
+ print(42 * 3 + 2)
+ print(2 + 42 + 3)
# some commments
- placeholder # maybe other comments
+ 42 # maybe other comments
if something:
other_thing
- print(placeholder)
+ print(42)
if __name__ == "__main__":
main()
We had a similar need, which wasn't solved by other answers here. So we created a library for this, ASTTokens, which takes an AST tree produced with the ast or astroid modules, and marks it with the ranges of text in the original source code.
It doesn't do modifications of code directly, but that's not hard to add on top, since it does tell you the range of text you need to modify.
For example, this wraps a function call in WRAP(...), preserving comments and everything else:
example = """
def foo(): # Test
'''My func'''
log("hello world") # Print
"""
import ast, asttokens
atok = asttokens.ASTTokens(example, parse=True)
call = next(n for n in ast.walk(atok.tree) if isinstance(n, ast.Call))
start, end = atok.get_text_range(call)
print(atok.text[:start] + ('WRAP(%s)' % atok.text[start:end]) + atok.text[end:])
Produces:
def foo(): # Test
'''My func'''
WRAP(log("hello world")) # Print
Hope this helps!
A Program Transformation System is a tool that parses source text, builds ASTs, allows you to modify them using source-to-source transformations ("if you see this pattern, replace it by that pattern"). Such tools are ideal for doing mutation of existing source codes, which are just "if you see this pattern, replace by a pattern variant".
Of course, you need a program transformation engine that can parse the language of interest to you, and still do the pattern-directed transformations. Our DMS Software Reengineering Toolkit is a system that can do that, and handles Python, and a variety of other languages.
See this SO answer for an example of a DMS-parsed AST for Python capturing comments accurately. DMS can make changes to the AST, and regenerate valid text, including the comments. You can ask it to prettyprint the AST, using its own formatting conventions (you can changes these), or do "fidelity printing", which uses the original line and column information to maximally preserve the original layout (some change in layout where new code is inserted is unavoidable).
To implement a "mutation" rule for Python with DMS, you could write the following:
rule mutate_addition(s:sum, p:product):sum->sum =
" \s + \p " -> " \s - \p"
if mutate_this_place(s);
This rule replace "+" with "-" in a syntactically correct way; it operates on the AST and thus won't touch strings or comments that happen to look right. The extra condition on "mutate_this_place" is to let you control how often this occurs; you don't want to mutate every place in the program.
You'd obviously want a bunch more rules like this that detect various code structures, and replace them by the mutated versions. DMS is happy to apply a set of rules. The mutated AST is then prettyprinted.
I used to use baron for this, but have now switched to parso because it's up to date with modern python. It works great.
I also needed this for a mutation tester. It's really quite simple to make one with parso, check out my code at https://github.com/boxed/mutmut
I am working on a django project that, along with having a database for its models and relations, writes to a log directory called activity_logs outside of the project directory to keep track of formatted user activities, one file for each user. This is an alternative file-structure-based solution to having a database table carry this information along, because this offloads some storage from the DB and is relatively easy to format and express such activities. Perhaps some of you may recommend storing this kind of data in the database, which is fine, but I still believe there is question from all of this that I need help answering.
This django project has multiple apps that have an extensive test suite, one for each app. Additionally, there is a logging.py file that encapsulates the logging functionality (writing/reading activities to/from log files), and so both the test cases within the test suites as well as the view functions (and various other utility functions) all utilize these logging functions in order to store these user activities and retrieve them based on model relationships to emulate a user notification system. Since the logging module takes care of this logging, it needs to know where to write to, and so we have a directory structure called activity_logs to which it writes user log files, creating one for a new user and deleting one for a user removed from the database. One of the newest changes we would like to make in this project is to create a separate logging directory for testing this logging functionality, something like test_activity_logs, so that it would never be confused when writing to the test directory for test users or the regular activity log directory for real users.
My problem is this: at runtime, how can I tell the system, at whichever startpoint of execution (whether it be from a view function call through the django test Client object, a test case, an actual HTTPrequest made via a URL, etc.), when to look inside the activity_logs or the test_activity_logs directory? It solely depends on whether I am generating new information for a real user or a test user, but a User is a User in our system, and I'm facing some trouble trying to tell these functions that call some logging functions to write to the test log directory vs. the regular one. For example, one approach I am trying is to pass a keyword argument (kwarg) to the logging functions so that they can be made aware of which directory to read/write to/from, like so:
self.assertTrue(activity_has_been_logged(ACTIVITY_ACCOUNT_CREATED, user.get_profile(), use_test_activity_log_directory=True) == True)
the kwarg called use_test_activity_log_directory=True will tell the logging function called activity_has_been_logged to read the test activity log directory. Unfortunately, apart from being a little inflexible (but tolerable), this doesn't solve the situation where the django test client object sends a GET or POST request via a URL to a view function that writes activities to log files:
response = client.post(propose_match_url, post) #Can't write to test_log_directory if by default it writes to regular directory!
How do I let the client pass on this kwarg to those view functions? I think that it should totally be possible to do this, but I'm not sure if fiddling with these kwargs is the best way, or maybe create a global variable in the project settings file, but maybe that might cause some trouble with race conditions with a shared mutable variable.
Your help would be great. Thanks in advance!
So I just solved this problem. The logging file hosting all logging functionality is really the only place that needs to know where to look (either test_activity_logs or activity_logs), since all other components will invoke functions from the logging module to write/read to/from these directories. I gave an additional field to the model class of the UserProfile class called is_test that is a boolean field to determine whether to look in the test_activity_logs if is_test=True, or activity_logs if is_test=False. That way, the logging module needs only to check the input parameter of type UserProfile and its new field to determine where to perform its logging functionalities. Problem solved!
Check out daemontools if you're on a *nix box or launchd on OS X. Both can make sure your Django instance stays running in whatever mode you prefer (daemontools has a few more options for that) and can isolate a directory for logging stdout/stderr.
You can set environment variables for each instance to help other log files and temporary files know where to be created, which you then get from os.environment or simply use the current working directory as a base if using daemontools.
The directory is automatically created for you using daemontools.