CarrierWave Backgrounder process leaves denied access to s3 image

My current Sidekiq setup, shown below, processes the image, but nothing is actually done in the background. The page takes 10 seconds to load, and then the new image is shown on the redirected page at the following path.
https://d9vwx0ll7rhpx.cloudfront.net/development/photo/image/product_thumb/product_thumb_0046d4ca-ca8d-4c02-b8cd-da0255c5736e.jpg
I would like for this process to be done in the background.
def create
  @photo = Photo.new(photo_params)
  if @photo.save
    UploadsWorker.perform_async(@photo.id, photo_params)
    flash[:notice] = "Your new photograph is being processed."
    redirect_to photographer_listings_path(@photog)
  else
    flash[:notice] = "Check the fields marked with an orange flag."
    render 'new'
  end
end
class UploadsWorker
  include Sidekiq::Worker
  def perform(photo_id, photo_params)
    photo = Photo.find(photo_id)
    photo.image = photo_params["image"]
    photo.save
  end
end
I then tried the CarrierWave Backgrounder gem instead, but found that version processing doesn't take place. The code runs, but no image appears on the redirected page after the new record saves.
When looking at the Sidekiq web interface, I can see that the job runs and then completes.
However, I am left with an image (within the same directory) that returns access denied.
This is strange, since the URL shown at the top, generated without the gem, was valid.
https://d9vwx0ll7rhpx.cloudfront.net/development/photo/image/product_thumb/product_thumb_a9cf111f-c93a-4ec3-b1be-293321147000.jpg

I had to use store_in_background rather than process_in_background in order for the image to be properly uploaded to S3.

Related

Session variable lost after redirect for feature spec

The session object is cleared after redirect during testing.
First, my test stack. I am running a Rails 4.2.4 app with capybara 2.6.2, capybara-webkit 1.8.0, and rspec 3.3.0. My tests were running without issue until, for no apparent reason whatsoever, they weren't.
Below is my code (condensed to stay on point):
assessment_controller.rb
def create
  @assessment_basic = Assessment::Basic.new(params)
  if @assessment_basic.valid?
    session[:most_recent_zip_code] = @assessment_basic.zip_code
    if household.search.present?
      update_household_search(@assessment_basic.zip_code)
    end
    redirect_to dashboard_path
  else
    render :new
  end
end
private
def household
  if @household
    @household
  elsif session[:household_id]
    @household = find_household(session[:household_id])
  elsif session[:most_recent_zip_code]
    @household = Household.create(household_params)
    session[:household_id] = @household.id
  end
  @household
end
As you can see, this is pretty straight-forward. I am receiving a zip code in the params. I store that zip code for later use and use it to create a household object unless one already exists, in which case I return that instance. If a household object is instantiated, I then store its id in the session, return control back to the action and redirect to the dashboard_path having two variables in session. All of this works well, and has worked well for some time now.
However, when I try to access the variables in the dashboard#index action, none of the session variables I stored are present. The feature works, which suggests that my problem is with the bits running my specs. By the way, the spec passes locally. It is when the code is moved to our CI environment that we get the error. We tried three different CI environments (Circle, Semaphore, and Travis) and they all report the same error:
#<NoMethodError: undefined method 'search' for nil:NilClass>
Which basically means, the household could not be recreated and is therefore nil. A closer look shows the reason the household could not be created from session is that the session was cleared.
Can someone help me identify the component(s) involved in ensuring session values persist during tests? Let me know if you need anything else in order to be able to help me.
Hector
it 'a client visits the referrals page with all providers minus the single stop and tax locations from the dashboard page and ' do
  user_visits_the_homepage
  enter_zip_code('11217', true)
  dashboard_page.expect_to_be_on
  dashboard_page.click_browse_local_resources
  referrals_page.expect_to_be_on
end
def user_visits_the_homepage
  visit '/'
  expect(page).to have_content t('welcome.where')
end
def enter_zip_code(zip_code = '11217', remain_dashboard = false)
  within '.welcome-form:nth-of-type(1)' do
    fill_in t('welcome.where'), with: zip_code
    click_on t('welcome.get_started')
  end
  expect(page).to have_content t('header.dashboard')
  click_on t('header.your_profile') unless remain_dashboard
  expect(page).to have_field t('activerecord.attributes.client_household_member.zip_code'), with: zip_code unless remain_dashboard
end

Rails image upload security

Currently, I use CarrierWave to let users upload images.
However, I can hardly find a solution for image security, i.e. how do I set up authorization so that only certain users in the same group can view an uploaded image?
After digging into Facebook's implementation, I observed that they inject these params (oh, oe, __gda__) into the image URL:
?oh=924eb34394&oe=55E07&__gda__=1436393492fc8bf91e1aec5
Is there any similar implementation for carrierwave or paperclip?
Thanks

I worked quite a bit with this (only with Paperclip).
There is one solution that is okay, but it takes a lot of processing.
If you only want to keep your files from being enumerated, you can hash your Paperclip attachment paths; see https://github.com/thoughtbot/paperclip/wiki/Hashing
If you want to authorize the user on every image load, you can do it like this:
Move your files out of the public folder:
has_attached_file :image,
  styles: { large: '1500x1500>', small: '250x250>' },
  path: ':rails_root/storage/gallery/image/:style/:filename'
Use send_file to serve the file:
def show
  send_file(object.image.path(:small), filename: object.image_file_name, type: "image/png", disposition: 'inline', x_sendfile: true)
end
I'm a bit reluctant, however, to implement this for something like an image gallery, since it takes a GET action plus authorization for each image. Using x-sendfile works with Apache to deliver the images faster.
Ref:
http://apidock.com/rails/ActionController/Streaming/send_file

I found this great solution for Paperclip at https://makandracards.com/makandra/734-deliver-paperclip-attachments-to-authorized-users-only
Though a little out of date, this article details everything needed to secure not only access to attachments but also the files themselves. It describes all of the steps to implement it, including Capistrano deployment!
Be sure to use updated routes by changing:
map.resources :notes, :member => { :attachment => :get }
to:
resources :notes, only: [] do
  member do
    get :attachment
  end
end
I also updated the link from:
link_to 'Download attachment', [:attachment, @note]
to:
link_to 'Download Attachment', attachment_note_path(@note.id)
Also see "Paperclip changing URL/Path" for configuring the URL.

CarrierWave stores uploads in /public by default, where all content is simply served as static content. If you need to control access to these uploads, I'd start by configuring a different storage path:
class TestUploader < CarrierWave::Uploader::Base
  def store_dir
    Rails.root.join('uploads', relative_path).to_s
  end
  def serving_path # Use this method to get the serving path of the upload
    File.join '/uploads', relative_path
  end
  private
  def relative_path
    File.join model.class.model_name.plural, model.id.to_s
  end
end
Since CarrierWave relies on public asset serving to serve uploads, you'll have to implement your own file serving method. This is a silly example of how to do that with Rails:
class Test < ApplicationRecord
  mount_uploader :file, TestUploader
end
Rails.application.routes.draw do
  get '/uploads/:model/:id', to: 'uploads#get'
end
class UploadsController < ApplicationController
  def get
    # ... authorization logic
    model = params.fetch(:model).singularize.camelcase.safe_constantize
    return head 400 unless model.present?
    send_file model.find(params.fetch(:id)).file.path
  end
end
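The question mentions signature-like query parameters on image URLs. One common way to get a similar effect is to sign the path, the viewer's group, and an expiry time with a server-side secret, then check that signature inside the authorization step before serving the file. The sketch below shows the idea in Python rather than Rails; the function names, the message layout, and the secret are all hypothetical, and it is not taken from CarrierWave, Paperclip, or Facebook.

```python
import hashlib
import hmac
import time

# Hypothetical secret; in a real app this would come from configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def make_signature(path, group, expires_at, key=SECRET_KEY):
    """Return a hex HMAC-SHA256 over the path, group and expiry timestamp."""
    message = "{}|{}|{}".format(path, group, int(expires_at)).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def is_request_allowed(path, group, expires_at, signature, key=SECRET_KEY, now=None):
    """Check that the link has not expired and that the signature matches."""
    now = time.time() if now is None else now
    if now > int(expires_at):
        return False  # link has expired
    expected = make_signature(path, group, expires_at, key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The server would append the expiry and signature to the image URL when rendering a page for an authorized user, and reject any request whose parameters fail the check.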

Manage multiple uploads with Flask session

I have the following situation. I created a simple backend in Flask that handles file uploads. With the files received, Flask does something (uploads them) and returns the data to the caller. There are two scenarios in the app: uploading one image and uploading multiple images. When uploading one image, I can simply get the response and voila, I'm all set.
However, I am stuck on handling multiple file uploads. I can use the same handler for the actual file upload, but the issue is that all of those files need to be stored into a list or something, then processed, and after doing that, a single link (album) containing all those images, needs to be delivered.
Here is my upload handling code:
@app.route('/uploadv3', methods=['POST'])
def upload():
    if request.method == 'POST':
        data_file = request.files["file"]
        file_name = data_file.filename
        path_to_save_to = os.path.join(app.config['UPLOAD_FOLDER'], file_name)
        data_file.save(path_to_save_to)
        file_url = upload_image_to_image_host(path_to_save_to)
        return file_url
I was experimenting with session in Flask, but I don't know whether I can create a list of items under one key, like session['links'], then get all of them and clear the key after doing the work. Or is there some other, simpler solution?
I assume that I could probably do this with a key for each image, like session["link1"] and so on, but that would impose a limit on the number of images (depending on how many keys I create), make the code very ugly, make it awkward to iterate over the keys to build the list that is passed to an album-building method, and make clearing the session tedious.
Some code that I wrote for getting the actual link at the end and clearing the session follows (this assumes that session['link'] holds a list of URLs, which I can't really achieve with my knowledge of session management in Flask):
def create_album(images):
    session.pop('link', None)
    new_album = im.create_album(images)
    return new_album.link
@app.route('/get_album_link')
def get_album_link():
    return create_album(session['link'])
Thanks in advance for your time!

You can assign anything to a session, including an individual value, a list, a dictionary, etc. If you know the links, you can store them in the session as follows:
session['links'] = ['link1', 'link2']  # ...and so on
This way, you have a list of all the links. You can now access each link with:
if 'links' in session:
    for link in session['links']:
        print(link)
Once you are done with them, you can clear them from the session with:
if 'links' in session:
    del session['links']
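Building the list up one upload at a time can be sketched as below. The helper names add_link and take_links are hypothetical. The sketch assigns a new list on every call rather than appending in place, because Flask's default cookie-based session only notices assignments and deletions of top-level keys; after an in-place change to a nested list you would need to set session.modified = True yourself. The helpers only rely on dict-style access, so they are shown against a plain dict-like object.

```python
def add_link(session, url):
    """Store url under session['links'], creating the list if needed."""
    # Assigning a fresh list (instead of calling .append on the stored one)
    # counts as a top-level change, so Flask will persist it.
    session['links'] = session.get('links', []) + [url]


def take_links(session):
    """Return all stored links and remove them from the session."""
    return session.pop('links', [])
```

In the upload handler you would call add_link(session, file_url) after each upload, and pass take_links(session) to the album-building step. Since the default session lives in a cookie, a very long list of URLs may run into the browser's cookie size limit.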

To clarify what I did to make this work: in the end, it turned out that uploading images and adding them to the album anonymously had to be done in "reverse", i.e. not adding images to an album object, but uploading each image object to an album ID.
I made a method that gets the album link and puts it in the session:
@app.route('/get_album_link')
def get_album_link():
    im = pyimgur.Imgur(CLIENT_ID)
    new_album = im.create_album()
    session.clear()
    session['album'] = new_album.deletehash
    session['album_link'] = new_album.link
    return new_album.link
Later on, when handling uploads, I just add the image to the album and voila, all set :)
uploaded_image = im.upload_image(path_of_saved_image, album=session['album'])
file_url = uploaded_image.link
return file_url
One caveat is that the value passed as the album must be the album's "deletehash", not the album ID (this is covered by the Imgur API documentation).

How To Target Photo(s) For a Page Using Graph API?

I need help understanding whether or not this is the intended behavior or if it's something that is not supported. What I want to do is upload a photo to a Facebook page post that is targeted to a specific country.
The steps I take are:
1) Upload the photo to a public album (e.g. Timeline Photos)
2) Retrieve the uploaded photo id from the album
3) Create an attachment to the post with the photo id from step #2
The problem is that I'm now uploading a photo to a public album that is not targeted to a specific country. What I would like to do is to have BOTH the photo(s) and page contents be restricted by targeting. I see a picture parameter in the api, but I believe this is just for setting the page's profile picture.
Any ideas on how this may be accomplished or whether or not this is even possible? Any code examples would also be welcomed.

After playing around with this more, I found that I had to set the 'source' parameter to point to a file on my local file system (pointing to a remote file did not work). I'm not sure whether the latter can be accomplished. Also, you have to call setFileUploadSupport with true.
Pseudo code via PHP:
// Create a temporary file somewhere
$contents = ...
$tmpfname = tempnam(sys_get_temp_dir(), $prefix);
$handle = fopen($tmpfname, "w");
fwrite($handle, $contents);
fclose($handle);
// Add to params
$params['source'] = '@' . $tmpfname;
$fb = FB_sdk::instance(true);
$fb->setFileUploadSupport(true);
$fb->api('[url to page]/photos', 'POST', $params);

Storing images and thumbnails on s3 in django

I'm trying to get my images thumbnailed and stored on s3 using django-storages, boto, and sorl-thumbnail. I have it working, but it's very slow, even with small images. I don't mind it being slow when I save the form and upload the images to s3, but I'd like it to display the image quickly after that.
The answer to this SO question explains that the thumbnail won't be created until first access, but that you can use get_thumbnail() to create it beforehand.
Django + S3 (boto) + Sorl Thumbnail: Suggestions for optimisation
I'm doing that, and now it seems that all entries into the thumbnail_kvstore table are created when uploading the image, rather than when it is displayed.
The problem is that the page displaying the image is still really slow. Looking at the logging panel in the debug toolbar, it looks like there is still a lot of communication with s3. It seems that once the image and thumbnails are uploaded and cached, the page should render quickly without communicating with s3.
What am I doing wrong? Thanks!
Update: weak hack seems to have gotten it working, but I'd love to know how to do this properly:
https://github.com/asciitaxi/sorl-thumbnail/commit/545cce3f5e719a91dd9cc21d78bb973b2211bbbf
Update: more information for @sorl
I'm working with 2 views:
ADD VIEW: In this view I submit the form to create the model with the image in it. The image is uploaded to s3. In a post_save signal, I call get_thumbnail() to generate the thumbnail before it's needed:
im = get_thumbnail(instance.image, '360x360')
DISPLAY VIEW: In this view I display the thumbnail generated in the add view:
{% thumbnail object.image "360x360" as im %}
<img src="{{ im.url }}" width="{{ im.width }}" height="{{ im.height }}">
{% endthumbnail %}
Without the patch:
ADD VIEW: creates 3 entries in the kvstore table, accesses the cache 10 times (6 sets, 4 gets), logging tab of debug toolbar says "establishing HTTP connection" 12 times
DISPLAY VIEW: still just 3 entries in the kvstore table, just 1 get from cache, but debug toolbar says "establishing HTTP connection" 3 times still
With only the change on line 122:
ADD VIEW: same as above, except the logging only says "establishing HTTP connection" 2 times
DISPLAY VIEW: same as above, except the logging only says "establishing HTTP connection" 1 time
Also adding the change on line 118:
ADD VIEW: same as above, but now we are down to 2 "establishing HTTP connection" messages
DISPLAY VIEW: same as above, with no logging messages at all
UPDATE: It looks like storage._setup() is called twice, and storage.url() is called once. Based on the timing, I'd say each one makes connections to s3:
1304711315.4
_setup
1304711317.84
1304711317.84
_setup
1304711320.3
1304711320.39
_url
1304711323.66
This seems to be reflected by the boto logging, which says "establishing HTTP connection" 3 times.
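Timings like the ones above can be collected by wrapping the storage methods of interest. The following is a minimal, self-contained sketch of that approach, not the code used in the question: log_timing and FakeStorage are hypothetical names, and FakeStorage merely stands in for a real storage backend whose _setup and url methods would be wrapped the same way.

```python
import functools
import time


def log_timing(func, log):
    """Wrap func so every call appends (function name, elapsed seconds) to log."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            log.append((func.__name__, time.perf_counter() - start))
    return wrapper


class FakeStorage:
    """Hypothetical stand-in for a remote storage backend."""

    def _setup(self):
        pass  # a real backend would open its connection here

    def url(self, name):
        return "https://example.invalid/" + name


calls = []
FakeStorage._setup = log_timing(FakeStorage._setup, calls)
FakeStorage.url = log_timing(FakeStorage.url, calls)

storage = FakeStorage()
storage._setup()
storage.url("thumb.jpg")
```

After a page render, the calls list shows which methods ran, in what order, and how long each took, which can then be compared against the connection messages in the boto log.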

As the author of sorl-thumbnail, I am really interested in solving this if it is not working as I intended. If the key value store is populated, it will currently store: name, storage and size. I have made the assumption that the url is based on the name and thus should not cause any storage calls. Looking at django-storages, https://github.com/e-loue/django-storages/blob/master/storages/backends/s3boto.py#L214, it seems like a safe assumption to make. In your patch you have patched the read method for some reason. When creating a thumbnail, an ImageFile instance is fetched from the cache (or created if it is not there). You can of course call read, which will read the file, but the intended use is .url, which calls url on the storage with the cached name, which in turn should be an operation that does not access storage. Could you try to isolate exactly where in your code this storage access happens?
Also make sure you have THUMBNAIL_DEBUG on and that you have the key value store properly set up.

I'm not sure if your problem is the same as mine, but I found that accessing the width or height property of a normal Django ImageField would read the file from the storage backend, load it into PIL, and return the dimensions from there. This is especially costly with a remote backend like the one we're using, and we have very media-heavy pages.
https://code.djangoproject.com/ticket/8307 was opened to address this but the Django devs closed as wontfix because they want the width and height properties to always return the true values. So I just monkeypatch _get_image_dimensions() to use those fields, which does prevent a large number of the boto messages and improves my page-load times.
Below is my code modified from the patch attached to that ticket. I stuck this in a place which gets executed early, such as a models.py.
from django.core.files.images import ImageFile, get_image_dimensions
def _get_image_dimensions(self):
    from numbers import Number
    if not hasattr(self, '_dimensions_cache'):
        close = self.closed
        if self.field.width_field and self.field.height_field:
            width = getattr(self.instance, self.field.width_field)
            height = getattr(self.instance, self.field.height_field)
            # check if the fields have proper values
            if isinstance(width, Number) and isinstance(height, Number):
                self._dimensions_cache = (width, height)
            else:
                self.open()
                self._dimensions_cache = get_image_dimensions(self, close=close)
        else:
            self.open()
            self._dimensions_cache = get_image_dimensions(self, close=close)
    return self._dimensions_cache
ImageFile._get_image_dimensions = _get_image_dimensions

After looking at the Django ticket @shadfc mentioned, I reimplemented the monkeypatch as follows:
from django.core.files.images import ImageFile, get_image_dimensions
def _get_image_dimensions(self):
    if not hasattr(self, '_dimensions_cache'):
        if getattr(self.storage, 'IGNORE_IMAGE_DIMENSIONS', False):
            # skip the storage read entirely and report placeholder dimensions
            self._dimensions_cache = (0, 0)
        else:
            close = self.closed
            self.open()
            self._dimensions_cache = get_image_dimensions(self, close=close)
    return self._dimensions_cache
ImageFile._get_image_dimensions = _get_image_dimensions
To use it, just add IGNORE_IMAGE_DIMENSIONS = True to your storage class, and the storage will not be touched to get image dimensions. For example:
from storages.backends.s3boto import S3BotoStorage
S3BotoStorage.IGNORE_IMAGE_DIMENSIONS = True
I still need to investigate where the numbers are used, to know whether simply returning (0, 0) can lead to any problems, but no bug has come up so far.
