How can I send a file to the browser using Scalatra? - scalatra

I'm using the scalatra-sbt-prototype. What would I have to modify, and where , to be able to serve files from a directory on my filesystem? Say for example , I would want to serve the file first.tar.gz from /home/downloads/first.tar.gz, and have it accessible as:
http://localhost:8080/first.tar.gz

For correctness, you may also want to set the contentType so the browser doesn't try to display it as text/html if you have that set in your before filter and the disposition header with the correct name. Most browsers will infer the filename from the url but just to be sure, you can set it explicitly.
get("/first.tar.gz") {
contentType = "application/octet-stream"
val file = new java.io.File("/home/downloads/first.tar.gz")
response.setHeader("Content-Disposition", "attachment; filename=" + file.getName)
file
}
Obviously the route is very static but will do what you want.

Returning a java.io.File to the browser, from an action, seems to accomplish this.

Related

301 redirect router for Django project

I have a website building tool created in Django and I'd like to add easy user defined 301 redirects to it.
Webflow has a very easy to understand tool for 301 redirects. You add a path (not just a slug) and then define where that path should lead the user.
I'd like to do the same for the Django project I'm working on. I currently allow users to set a slug that redirects /<slug:redirect_slug>/ and they can set to go to any URL. But I'd like them to be able to add, for example, the path for an old blog post '/2018/04/12/my-favorite-thing/'
What's the best URL conf to use in Django to safely accept any path the user wants?
You can use the Path Converters that convert the path parameters into appropriate types, which also includes a converter for urls.
An example would be like the following:
path('api/<path:encoded_url>/', YourView.as_view()),
As per the docs:
Matches any non-empty string, including the path separator, '/'. This allows you to match against a complete URL path rather than just a segment of a URL path as with str.
In your view, you can get your URL like this:
encoded_url = self.kwargs.get('encoded_url')
Add a RerouteMiddleware which first checks if the request can be served by the existing URLs from the urls.py. If it cannot be served, check if the requested path is from the old -> new URLs mapping, if a match found redirect it to the new URL.
Sample piece of code to try it out.
try:
resolve(request.path_info)
except Resolver404:
# Check if the URL exists in your database/constants
# where you might have stored the old -> new URL mapping.
if request.path is valid:
new_url = # Retrieve the new URL
return redirect(new_url)
response = self.get_response(request)
return response

Adding full path to django MarkdownX images?

I have added a MarkdownxField to my django model, and it works well. I can edit it witha proper preview from the admin panel.
However, when I added images to the markdown there's a problem. My app is built in React and is served from a different port on the domain. When MarkdownX adds the image to the markdown file it does so with a relative path, so the call goes to the client port instead of the server port where it automatically saves the image.
I've looked through the settings options for MarkdownX but couldn't find something that would help out. The image is uploaded an saved well. But the client side can't get to it.
So this isn't exactly what I was going for in the question, but I solved it in the Client side in this by injecting the full path when parsing the markdownx content.
Just in case anyone gets stuck on it here's what i did:
// This util function acts as a middleware when i collect the data.
import { serverPath } from "../serverPaths";
const markdownImageInjector = (data: string): string => {
let injected = data.replace(
/!\[\]\(\/media\/markdownx\//g,
`![](${serverPath}media/markdownx/`
);
return injected;
};
export default markdownImageInjector;
and here is the action:
const getArticle = (id: string) =>
axios
.get(`${apiPath}/articles/${id}`)
.then(res => {
let data = res.data;
data.content = markdownImageInjector(data.content);
return data;
})
The possible solution (maybe the hard one too) is, provide the absolute path to the image
Ex:
![MyImage](https://my.domain.com/mediafiles/markdownx/e4692afa-9608-4fac-bd85-c28209879d0b.png)

get list of files in a sharepoint directory using python

I have a url for sharepoint directory(intranet) and need an api to return list of files in that directory given the url. how can I do that using python?
Posting in case anyone else comes across this issue of getting files from a SharePoint folder from just the folder path.
This link really helped me do this: https://github.com/vgrem/Office365-REST-Python-Client/issues/98. I found so much info about doing this for HTTP but not in Python so hopefully someone else needs more Python reference.
I am assuming you are all setup with client_id and client_secret with the Sharepoint API. If not you can use this for reference: https://learn.microsoft.com/en-us/sharepoint/dev/solution-guidance/security-apponly-azureacs
I basically wanted to grab the names/relative urls of the files within a folder and then get the most recent file in the folder and put into a dataframe.
I'm sure this isn't the "Pythonic" way to do this but it works which is good enough for me.
!pip install Office365-REST-Python-Client
from office365.runtime.auth.client_credential import ClientCredential
from office365.runtime.client_request_exception import ClientRequestException
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File
import io
import datetime
import pandas as pd
sp_site = 'https://<org>.sharepoint.com/sites/<my_site>/'
relative_url = "/sites/<my_site/Shared Documents/<folder>/<sub_folder>"
client_credentials = ClientCredential(credentials['client_id'], credentials['client_secret'])
ctx = ClientContext(sp_site).with_credentials(client_credentials)
libraryRoot = ctx.web.get_folder_by_server_relative_path(relative_url)
ctx.load(libraryRoot)
ctx.execute_query()
#if you want to get the folders within <sub_folder>
folders = libraryRoot.folders
ctx.load(folders)
ctx.execute_query()
for myfolder in folders:
print("Folder name: {0}".format(myfolder.properties["ServerRelativeUrl"]))
#if you want to get the files in the folder
files = libraryRoot.files
ctx.load(files)
ctx.execute_query()
#create a dataframe of the important file properties for me for each file in the folder
df_files = pd.DataFrame(columns = ['Name', 'ServerRelativeUrl', 'TimeLastModified', 'ModTime'])
for myfile in files:
#use mod_time to get in better date format
mod_time = datetime.datetime.strptime(myfile.properties['TimeLastModified'], '%Y-%m-%dT%H:%M:%SZ')
#create a dict of all of the info to add into dataframe and then append to dataframe
dict = {'Name': myfile.properties['Name'], 'ServerRelativeUrl': myfile.properties['ServerRelativeUrl'], 'TimeLastModified': myfile.properties['TimeLastModified'], 'ModTime': mod_time}
df_files = df_files.append(dict, ignore_index= True )
#print statements if needed
# print("File name: {0}".format(myfile.properties["Name"]))
# print("File link: {0}".format(myfile.properties["ServerRelativeUrl"]))
# print("File last modified: {0}".format(myfile.properties["TimeLastModified"]))
#get index of the most recently modified file and the ServerRelativeUrl associated with that index
newest_index = df_files['ModTime'].idxmax()
newest_file_url = df_files.iloc[newest_index]['ServerRelativeUrl']
# Get Excel File by newest_file_url identified above
response= File.open_binary(ctx, newest_file_url)
# save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) # set file object to start
# load Excel file from BytesIO stream
df = pd.read_excel(bytes_file_obj, sheet_name='Sheet1', header= 0)
Here is another helpful link of the file properties you can view: https://learn.microsoft.com/en-us/previous-versions/office/developer/sharepoint-rest-reference/dn450841(v=office.15). Scroll down to file properties section.
Hopefully this is helpful to someone. Again, I am not a pro and most of the time I need things to be a bit more explicit and written out. Maybe others feel that way too.
You need to do 2 things here.
Get a list of files (which can be directories or simple files) in
the directory of your interest.
Loop over each item in this list of files and check if
the item is a file or a directory. For each directory do the same as
step 1 and 2.
You can find more documentation at https://learn.microsoft.com/en-us/sharepoint/dev/sp-add-ins/working-with-folders-and-files-with-rest#working-with-files-attached-to-list-items-by-using-rest
def getFilesList(directoryName):
...
return filesList
# This will tell you if the item is a file or a directory.
def isDirectory(item):
...
return true/false
Hope this helps.
I have a url for sharepoint directory
Assuming you asking about a library, you can use SharePoint's REST API and make a web service call to:
https://yourServer/sites/yourSite/_api/web/lists/getbytitle('Documents')/items?$select=Title
This will return a list of documents at: https://yourServer/sites/yourSite/Documents
See: https://msdn.microsoft.com/en-us/library/office/dn531433.aspx
You will of course need the appropriate permissions / credentials to access that library.
You can not use "server name/sites/Folder name/Subfolder name/_api/web/lists/getbytitle('Documents')/items?$select=Title" as URL in SharePoint REST API.
The URL structure should be like below considering WebSiteURL is the URL of site/subsite containing document library from which you are trying to get files and Documents is the Display name of document library:
WebSiteURL/_api/web/lists/getbytitle('Documents')/items?$select=Title
And if you want to list metadata field values you should add Field names separated by comma in $select.
Quick tip: If you are not sure about the REST API URL formation. Try pasting the URL in Chrome browser (you must be logged in to SharePoint site with appropriate permissions) and see if you get proper result as XML if you are successful then update the REST URL and run the code. This way you will save time of running your python code.

Save pdf from django-wkhtmltopdf to server (instead of returning as a response)

I'm writing a Django function that takes some user input, and generates a pdf for the user. However, the process for generating the pdf is quite intensive, and I'll get a lot of repeated requests so I'd like to store the generated pdfs on the server and check if they already exist before generating them.
The problem is that django-wkhtmltopdf (which I'm using for generation) is meant to return to the user directly, and I'm not sure how to store it on the file.
I have the following, which works for returning a pdf at /pdf:
urls.py
urlpatterns = [
url(r'^pdf$', views.createPDF.as_view(template_name='site/pdftemplate.html', filename='my_pdf.pdf'))
]
views.py
class createPDF(PDFTemplateView):
filename = 'my_pdf.pdf'
template_name = 'site/pdftemplate.html'
So that works fine to create a pdf. What I'd like is to call that view from another view and save the result. Here's what I've got so far:
#Create pdf
pdf = createPDF.as_view(template_name='site/pdftemplate.html', filename='my_pdf.pdf')
pdf = pdf(request).render()
pdfPath = os.path.join(settings.TEMP_DIR,'temp.pdf')
with open(pdfPath, 'w') as f:
f.write(pdf.content)
This creates temp.pdf and is about the size I'd expect but the file isn't valid (it renders as a single completely blank page).
Any suggestions?
Elaborating on the previous answer given: to generate a pdf file and save to disk do this anywhere in your view:
...
context = {...} # build your context
# generate response
response = PDFTemplateResponse(
request=self.request,
template=self.template_name,
filename='file.pdf',
context=context,
cmd_options={'load-error-handling': 'ignore'})
# write the rendered content to a file
with open("file.pdf", "wb") as f:
f.write(response.rendered_content)
...
I have used this code in a TemplateView class so request and template fields were set like that, you may have to set it to whatever is appropriate in your particular case.
Well, you need to take a look to the code of wkhtmltopdf, first you need to use the class PDFTemplateResponse in wkhtmltopdf.views to get access to the rendered_content property, this property get us access to the pdf file:
response = PDFTemplateResponse(
request=<your_view_request>,
template=<your_template_to_render>,
filename=<pdf_filename.pdf>,
context=<a_dcitionary_to_render>,
cmd_options={'load-error-handling': 'ignore'})
Now you could use the rendered_content property to get access to the pdf file:
mail.attach('pdf_filename.pdf', response.rendered_content, 'application/pdf')
In my case I'm using this pdf to attach to an email, you could store it.

Django upload file into specific directory that depends on the POST URI

I'd like to store uploaded files into a specific directory that depends on the URI of the POST request. Perhaps, I'd also like to rename the file to something fixed (the name of the file input for example) so I have an easy way to grep the file system, etc. and also to avoid possible security problems.
What's the preferred way to do this in Django?
Edit: I should clarify that I'd be interested in possibly doing this as a file upload handler to avoid writing a large file twice to the file system.
Edit2: I suppose one can just 'mv' the tmp file to a new location. That's a cheap operation if on the same file system.
Fixed olooney example. It is working now
#csrf_exempt
def upload_video_file(request):
folder = 'tmp_dir2/' #request.path.replace("/", "_")
uploaded_filename = request.FILES['file'].name
BASE_PATH = '/home/'
# create the folder if it doesn't exist.
try:
os.mkdir(os.path.join(BASE_PATH, folder))
except:
pass
# save the uploaded file inside that folder.
full_filename = os.path.join(BASE_PATH, folder, uploaded_filename)
fout = open(full_filename, 'wb+')
file_content = ContentFile( request.FILES['file'].read() )
try:
# Iterate through the chunks.
for chunk in file_content.chunks():
fout.write(chunk)
fout.close()
html = "<html><body>SAVED</body></html>"
return HttpResponse(html)
except:
html = "<html><body>NOT SAVED</body></html>"
return HttpResponse(html)
Django gives you total control over where (and if) you save files. See: http://docs.djangoproject.com/en/dev/topics/http/file-uploads/
The below example shows how to combine the URL and the name of the uploaded file and write the file out to disk:
def upload(request):
folder = request.path.replace("/", "_")
uploaded_filename = request.FILES['file'].name
# create the folder if it doesn't exist.
try:
os.mkdir(os.path.join(BASE_PATH, folder))
except:
pass
# save the uploaded file inside that folder.
full_filename = os.path.join(BASE_PATH, folder, uploaded_filename)
fout = open(full_filename, 'wb+')
# Iterate through the chunks.
for chunk in fout.chunks():
fout.write(chunk)
fout.close()
Edit: How to do this with a FileUploadHandler? It traced down through the code and it seems like you need to do four things to repurpose the TemporaryFileUploadHandler to save outside of FILE_UPLOAD_TEMP_DIR:
extend TemporaryUploadedFile and override init() to pass through a different directory to NamedTemporaryFile. It can use the try mkdir except for pass I showed above.
extend TemporaryFileUploadHandler and override new_file() to use the above class.
also extend init() to accept the directory where you want the folder to go.
Dynamically add the request handler, passing through a directory determined from the URL:
request.upload_handlers = [ProgressBarUploadHandler(request.path.replace('/', '_')]
While non-trivial, it's still easier than writing a handler from scratch: In particular, you won't have to write a single line of error-prone buffered reading. Steps 3 and 4 are necessary because FileUploadHandlers are not passed request information by default, I believe, so you'll have to tell it separately if you want to use the URL somehow.
I can't really recommend writing a custom FileUploadHandler for this. It's really mixing layers of responsibility. Relative to the speed of uploading a file over the internet, doing a local file copy is insignificant. And if the file's small, Django will just keep it in memory without writing it out to a temp file. I have a bad feeling that you'll get all this working and find you can't even measure the performance difference.