I have the following directory structure:
A
|
|--B--Hello.py
|
|--C--Message.py
Now, if the path of the root directory A is not fixed, how can I import Hello.py from B into Message.py in C?
First, I suggest adding an empty __init__.py file to every directory that contains Python sources. It will prevent many import issues, because this is how packages work in Python.
In your case it should look like this:
A
├── B
│ ├── Hello.py
│ └── __init__.py
├── C
│ ├── Message.py
│ └── __init__.py
└── __init__.py
Let's say Hello.py contains the function foo:
def foo():
return 'bar'
and the Message.py tries to use it:
from ..B.Hello import foo
print(foo())
The first way to make it work is to let the Python interpreter do its job and handle the package construction; run this from the directory that contains A:
~ $ python -m A.C.Message
Another option is to add the parent directory A to the list of module search paths with the following code:
# Message.py file
import sys, os
sys.path.insert(0, os.path.abspath('..'))
from B.Hello import foo
print(foo())
In this case you can execute it with
~/A/C $ python Message.py
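Note that os.path.abspath('..') is resolved against the current working directory, so this only works when you launch the script from inside A/C. A minimal sketch of a variant that resolves A from the file's own location instead (same assumed layout as above):
# Message.py file (variant; resolves A from this file's location)
import os
import sys

# A/C/Message.py -> A/C -> A, regardless of the working directory
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from B.Hello import foo

print(foo())
With this variant, python /path/to/A/C/Message.py works from any directory.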
What is the proper way of serving static files (images, PDFs, Docs etc) from a flask server?
I have used the send_from_directory method before and it works fine. Here is my implementation:
@app.route('/public/assignments/<path:filename>')
def file(filename):
    return send_from_directory("./public/assignments/", filename, as_attachment=True)
However, if I have multiple different folders, it can get a bit hectic and repetitive, because you are essentially writing the same code for different file locations: if I wanted to display files for a user instead of an assignment, I'd have to change the route to /public/users/<path:filename> instead of /public/assignments/<path:filename>.
The way I thought of solving this is essentially making a /file/<path:filepath> route, where the filepath is the entire path to the destination folder + the file name and extension, instead of just the file name and extension. Then I did some formatting and separated the parent directory from the file itself and used that data when calling the send_from_directory function:
@app.route('/file/<path:filepath>', methods=["GET"])
def general_static_files(filepath):
    filepath = filepath.split("/")
    _dir = ""
    for i, p in enumerate(filepath):
        if i < len(filepath) - 1:
            _dir = _dir + (p + "/")
    return send_from_directory(("./" + _dir), filepath[len(filepath) - 1], as_attachment=True)
If we simulate the following request to this route:
curl http://127.0.0.1:5000/file/public/files/jobs/images/job_43_image_1.jpg
the _dir variable will hold the ./public/files/jobs/images/ value, and then filepath[len(filepath) - 1] holds the job_43_image_1.jpg value.
If I hit this route, I get a 404 - Not Found response, but all the code in the route body is being executed.
I suspect that the send_from_directory function is the reason why I'm getting a 404 - Not Found. However, I do have the image job_43_image_1.jpg stored inside the /public/files/jobs/images/ directory.
I'm afraid I don't see a lot I can do here except hope that someone has encountered the same issue/problem and found a way to fix it.
Here is the folder tree:
├── [2050] app.py
├── [2050] public
│ ├── [2050] etc
│ └── [2050] files
│ ├── [2050] jobs
│ │ ├── [2050] files
│ │ └── [2050] images
│ │ ├── [2050] job_41_image_decline_1.jpg
│ │ ├── [2050] job_41_image_decline_2554.jpg
│ │ ├── [2050] ...
│ ├── [2050] shop
│ └── [2050] videos
└── [2050] server_crash.log
Edit 1: I have set up the static_url_path. I have no reason to believe that it could be the cause of my problem.
Edit 2: Added tree
Pass these arguments when you initialise the app:
app = Flask(__name__, static_folder='public',
            static_url_path='/frontend_public')
This would make the file public/blah.txt available at http://example.com/frontend_public/blah.txt.
static_folder sets the folder on the filesystem
static_url_path sets the path used within the URL
If neither variable is set, both default to 'static'.
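As for the catch-all /file/ route in the question: send_from_directory already accepts a nested relative path and joins it safely against the base directory, so splitting the path by hand isn't necessary. A minimal sketch, assuming the files live in a public/ folder next to app.py; the base directory is made absolute to avoid any ambiguity about how relative paths are resolved (depending on the Flask version, against the application root or the working directory), which is a common source of unexpected 404s:
import os
from flask import Flask, send_from_directory

app = Flask(__name__)

# Anchor the served directory to the app's location, not the CWD
BASE_DIR = os.path.join(app.root_path, 'public')

@app.route('/file/<path:filepath>', methods=["GET"])
def general_static_files(filepath):
    # filepath may contain slashes, e.g. "files/jobs/images/job_43_image_1.jpg";
    # send_from_directory rejects attempts to escape BASE_DIR
    return send_from_directory(BASE_DIR, filepath, as_attachment=True)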
Hopefully this is what you're asking.
This question already has an answer here:
Refering to a directory in a Flask app doesn't work unless the path is absolute
I have a standalone app that takes in an Excel file and outputs a Word doc. This works fine as a standalone app.
I have now tried to integrate it into a Flask application, but Flask can't find the "templates" subfolder of my application. Here is my file structure:
my_flask_site
├── flask_app.py
├── __init__.py
├── templates
│ ├── index.html
│ └── report.html
├── uploads
│ └── myfile.xlsx
│
└── apps
    └── convert_app
        ├── __init__.py
        ├── main.py
        ├── report
        │ ├── __init__.py
        │ ├── data_ingest.py
        │ └── report_output.py
        └── templates
            └── output_template.docx
Now I can't get report_output.py to find output_template.docx now that it is inside the Flask application.
def run_report(file):
    data = data_ingest.Incident(file)
    priority_count = dict(data.df_length())
    size = sum(priority_count.values())
    print(priority_count)
    print(size)
    report = report_output.Report()
    report.header()
    report.priority_header(0)
    i = 0
    if '1' in priority_count:
        for _ in range(priority_count['1']):
            field = data.fields(i)
            report.priority_body(field)
            i += 1
        report.break_page()
        report.priority_header(1)
    else:
        report.none()
        report.priority_header(1)
    if '2' in priority_count:
        for _ in range(priority_count['2']):
            field = data.fields(i)
            report.priority_body(field)
            i += 1
        report.break_page()
        report.priority_header(2)
    else:
        report.none()
        report.break_page()
        report.priority_header(2)
    if '3' in priority_count:
        for _ in range(priority_count['3']):
            field = data.fields(i)
            report.priority_body(field)
            i += 1
        report.break_page()
    if '4' in priority_count:
        for _ in range(priority_count['4']):
            field = data.fields(i)
            i += 1
    output = OUTPUT_FILE + f"/Platform Control OTT Daily Report {data.field[0]}.docx"
    report.save(output)
    print(f"Report saved to:\n\n\t {output}")

def main(file):
    run_report(file)

if __name__ == "__main__":
    main()
and here is the report_output.py (without the word format part):
from datetime import datetime

from docx import Document

class Report(object):
    def __init__(self):
        self.doc = Document('./templates/pcc_template.docx')
        self.p_title = ['Major Incident',
                        'Stability Incidents (HPI)',
                        'Other Incidents']
        self.date = datetime.now().strftime('%d %B %Y')

    def save(self, output):
        self.doc.save(output)
There is more in the format_report.py file, but it is related to the function of the app. Where I am stuck is how to get the app to again see its own template folder and the template file inside it.
I have solved my problem, after finding this post: Refering to a directory in a Flask app doesn't work unless the path is absolute.
What I take from this is that the file path has to be absolute from the Flask application's root folder; in this case "my_flask_site" is the root folder, and adding the full file path solved the problem.
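One robust way to implement this is to build the template path from the module's own location instead of the working directory. A minimal sketch; the path pieces and file names follow the tree above, so treat them as assumptions:
# report_output.py
import os

from docx import Document

# apps/convert_app/report/ -> apps/convert_app/templates/
TEMPLATE_DIR = os.path.join(
    os.path.dirname(os.path.abspath(__file__)), '..', 'templates')

class Report(object):
    def __init__(self):
        self.doc = Document(os.path.join(TEMPLATE_DIR, 'output_template.docx'))
This keeps working no matter which directory the Flask process is started from.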
I have several thousand files in an S3 bucket in this form:
├── bucket
│ ├── somedata
│ │ ├── year=2016
│ │ ├── year=2017
│ │ │ ├── month=11
│ │ │ │ ├── sometype-2017-11-01.parquet
│ │ │ │ ├── sometype-2017-11-02.parquet
│ │ │ │ ├── ...
│ │ │ ├── month=12
│ │ │ │ ├── sometype-2017-12-01.parquet
│ │ │ │ ├── sometype-2017-12-02.parquet
│ │ │ │ ├── ...
│ │ ├── year=2018
│ │ │ ├── month=01
│ │ │ │ ├── sometype-2018-01-01.parquet
│ │ │ │ ├── sometype-2018-01-02.parquet
│ │ │ │ ├── ...
│ ├── moredata
│ │ ├── year=2017
│ │ │ ├── month=11
│ │ │ │ ├── moretype-2017-11-01.parquet
│ │ │ │ ├── moretype-2017-11-02.parquet
│ │ │ │ ├── ...
│ │ ├── year=...
etc
Expected behavior:
The AWS Glue Crawler creates one table for each of somedata, moredata, etc. It creates partitions for each table based on the children's path names.
Actual behavior:
The AWS Glue Crawler performs the behavior above, but ALSO creates a separate table for every partition of the data, resulting in several hundred extraneous tables (and more extraneous tables with every data add and new crawl).
I see no setting that would prevent this from happening. Does anyone have advice on the best way to prevent these unnecessary tables from being created?
Adding the following to the crawler's exclude patterns
**_SUCCESS
**crc
worked for me (see the AWS page glue/add-crawler). Double stars match files at all folder (i.e. partition) depths. I had an _SUCCESS file living a few levels up.
Make sure you set up logging for Glue, which quickly points out permission errors etc.
Use the Create a Single Schema for Each Amazon S3 Include Path option to avoid the AWS Glue Crawler adding all these extra tables.
I had this problem and ended up with ~7k tables 😅 so I wrote the following script to remove them. It requires jq.
#!/bin/sh
aws glue get-tables --region <YOUR AWS REGION> --database-name <YOUR AWS GLUE DATABASE> | jq '.TableList[] | .Name' | grep <A PATTERN THAT MATCHES YOUR TABLE NAMES> > /tmp/table-names.json
cd /tmp
mkdir table-names
cd table-names
split -l 50 ../table-names.json
# Delete in chunks of 50 (batch-delete-table accepts at most 100 names per call)
for f in *; do
    cat "$f" | tr '\r\n' ' ' | xargs aws glue batch-delete-table --region <YOUR AWS REGION> --database-name <YOUR AWS GLUE DATABASE> --tables-to-delete
done
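If you prefer to stay in Python, here is a rough boto3 equivalent (an untested sketch; the database name and the name pattern are placeholders you need to fill in):
import boto3

DATABASE = '<YOUR AWS GLUE DATABASE>'                  # placeholder
PATTERN = '<A PATTERN THAT MATCHES YOUR TABLE NAMES>'  # placeholder

glue = boto3.client('glue')

# Collect the names of the extraneous tables, page by page
names = []
for page in glue.get_paginator('get_tables').paginate(DatabaseName=DATABASE):
    names += [t['Name'] for t in page['TableList'] if PATTERN in t['Name']]

# batch_delete_table accepts at most 100 names per call
for i in range(0, len(names), 100):
    glue.batch_delete_table(DatabaseName=DATABASE, TablesToDelete=names[i:i + 100])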
Check whether you have empty folders inside. When Spark writes to S3, the _temporary folder is sometimes not deleted, which will make the Glue crawler create a table for each partition.
I was having the same problem.
I added *crc* as an exclude pattern to the AWS Glue crawler and it worked.
Or, if you crawl entire directories, add */*crc*.
My case was a little different, but I was seeing the same behaviour.
I had a data structure like this:
├── bucket
│ ├── somedata
│ │ ├── event_date=2016-01-01
│ │ ├── event_date=2016-01-02
So when I ran the AWS Glue Crawler, instead of updating the tables, this pipeline was creating one table per date. After digging into the problem, I found that someone had introduced a column-name bug in the JSON file: ID instead of id. Because my data is Parquet, the pipeline was working well to store the data and retrieve it inside EMR, but Glue was crashing badly, probably because Glue converts everything to lowercase. After removing the uppercase column, Glue started to work like a charm.
You need to have separate crawlers for each table / file type. So create one crawler that looks at s3://bucket/somedata/ and a second crawler that looks at s3://bucket/moredata/.
I want to keep my groovy source files in their own directory, with the tests being in a separate directory.
I have the directory structure as follows:
.
├── build
│ └── Messenger.class
├── build.xml
├── ivy.xml
├── lib
├── src
│ └── com
│ └── myapp
│ └── Messenger.groovy
└── test
└── unit
├── AnotherTest.groovy
└── MessengerTest.groovy
I can successfully run one test by using the groovy command and specifying the class path for the units under test (using -cp to point to build/), but how do I run all the tests in the directory?
You can run all unit tests with the command:
grails test-app unit:
If you have unit, integration, functional... tests, you can run them all with the command:
grails test-app
I am new to Groovy, but I wrote my own test runner and put it in the root directory of my project. Source code:
import groovy.util.GroovyTestSuite
import junit.textui.TestRunner
import junit.framework.TestResult
import static groovy.io.FileType.FILES
public class MyTestRunner {
    public static ArrayList getTestFilesPaths(String test_dir) {
        // Collect the absolute paths of all .groovy test files under test_dir
        ArrayList testFilesPaths = new ArrayList();
        new File(test_dir).eachFileRecurse(FILES) {
            if (it.name.endsWith(".groovy")) {
                testFilesPaths.add(it.absolutePath)
            }
        }
        return testFilesPaths;
    }

    public static GroovyTestSuite getTestSuite(ArrayList testFilesPaths) {
        // Compile each test file and add it to a single suite
        GroovyTestSuite suite = new GroovyTestSuite();
        testFilesPaths.each {
            suite.addTestSuite(suite.compile(it));
        }
        return suite;
    }

    public static void runTests(GroovyTestSuite suite) {
        // Run the suite; exit with a non-zero code if any test fails,
        // which is useful when this runner gates a build step
        TestResult result = TestRunner.run(suite);
        if (!result.wasSuccessful()) {
            System.exit(1);
        }
    }
}
ArrayList testFilesPaths = MyTestRunner.getTestFilesPaths("tests");
GroovyTestSuite suite = MyTestRunner.getTestSuite(testFilesPaths);
MyTestRunner.runTests(suite)
If you try to use this, be aware that if it fails, it is most likely because getTestFilesPaths is not working properly.
My directory structure
.
├── test_runner.groovy
├── src
│ └── ...
└── tests
    ├── Test1.groovy
    └── someDir
        ├── Test2.groovy
        └── Test3.groovy
How to run
From the directory containing test_runner.groovy, run:
groovy test_runner.groovy
I have the following directory structure:
myapp
├── apps
│ ├── myapp
│ ├── myotherapp
│ └── myapp_common
├── deps
│ ├── cowboy
......
I run eunit using rebar as follows in the main myapp directory:
./rebar skip_deps=true eunit
It correctly runs eunit for the three apps in apps/. After that it tries to run eunit in the parent myapp directory and throws the following error:
......
==> myapp (eunit)
ERROR: eunit failed while processing /home/msheikh/myapp: {'EXIT',{{badmatch,{error,{1,
"cp: missing destination file operand after `.eunit'\nTry `cp --help' for more information.\n"}}},
[{rebar_file_utils,cp_r,2,[]},
{rebar_eunit,eunit,2,[]},
{rebar_core,run_modules,4,[]},
{rebar_core,execute,4,[]},
{rebar_core,process_dir,4,[]},
{rebar_core,process_commands,2,[]},
{rebar,main,1,[]},
{escript,run,2,[{file,"escript.erl"},{line,727}]}]}}
Question: How can I fix this or prevent eunit from running for the parent myapp directory?
The rebar.config file in the main myapp directory looks like this:
{lib_dirs, ["deps", "apps"]}.
{deps, [
{lager, ".*", {git, "https://github.com/basho/lager.git", {branch, "master"}}},
{jsx, ".*", {git, "git://github.com/talentdeficit/jsx.git", {tag, "v0.9.0"}}},
{cowboy, "", {git, "git://github.com/extend/cowboy.git", {branch, "master"}}},
....
]}.
{require_otp_vsn, "R15"}.
{escript_incl_apps, [getopt]}.
{erl_opts, [
debug_info,
warn_missing_spec,
{parse_transform, lager_transform}
]}.
{eunit_opts, [verbose]}.
{validate_app_modules, false}.
{sub_dirs, [
"apps/myapp/",
"apps/myotherapp/",
"apps/myapp_common/"]}.
I have the same project structure, and it works.
Are you sure you don't have src, test, or ebin folders in the top-level directory?
If not, what happens if you run mkdir .eunit? (I am not suggesting you keep this, but it may point you toward a solution.)