
Handling large file uploads

uploads large files uploading upload large file files forms

6 replies to this topic

#1 nimbus510

nimbus510

    Passenger

  • Members
  • 3 posts

Posted 20 August 2013 - 10:24 PM

I have a webapp that allows users to upload video files.  These files vary in size, but some may be as large as 4GB.

 

The problem is that when a user submits a create/edit form that posts a large video file, there is an absurd amount of server-side processing that seems to take place before a response is given to the client.  Sometimes this results in page timeouts.

 

Debugging this further, the majority of the time spent in limbo occurs AFTER the upload has completed, but BEFORE any code in the controller gets executed.  I can see that the upload reaches 100% (either through the browser status bar or by subscribing to XHR onprogress in my JS), and then absolutely nothing happens in my server logs for several minutes.  Then, finally, my before hooks run, my action runs, etc.

 

So can anyone tell me what the heck is happening during this time?  I'm assuming some sort of file copy, maybe from one temp directory to another, before I actually have access to the posted file in order to move it to its final destination?  

 

I would think this would be a very common problem for anyone trying to handle large file uploads, but so far I have come up with no answers.

 

Once the code in my action actually runs, then I can handle queueing the final file move (final destination is on NAS) by using delayed_job or something similar, but I have no clue what to do about the period of time spent in limbo.

 

Ideally, as the upload progresses, the file chunks received would be sent straight over to their final destination on the NAS to avoid repeated file copies, but I have absolutely no idea how to make that happen, or whether that level of control is even possible through Rails.
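
For illustration only, the "send chunks straight to the destination" idea can be pictured as a chunked copy that never holds the whole file in memory. This is a minimal sketch, assuming the incoming body is available as an IO-like object; whether the web server and Rails expose the body before buffering it is exactly the open question in this thread. `stream_to_destination`, `CHUNK_SIZE`, and the paths are hypothetical names, not part of Rails.

```ruby
require "fileutils"

# Arbitrary read size for this sketch (1 MiB); tune for the actual storage.
CHUNK_SIZE = 1024 * 1024

# Hypothetical helper: copy an IO-like source to dest_path one chunk at a
# time, so memory use stays bounded no matter how large the upload is.
# Returns the number of bytes written.
def stream_to_destination(source_io, dest_path, chunk_size: CHUNK_SIZE)
  FileUtils.mkdir_p(File.dirname(dest_path))
  bytes_written = 0
  File.open(dest_path, "wb") do |dest|
    # IO#read(n) returns nil at end of input, which ends the loop.
    while (chunk = source_io.read(chunk_size))
      bytes_written += dest.write(chunk)
    end
  end
  bytes_written
end
```

Ruby's built-in `IO.copy_stream(source_io, dest_path)` does the same job in one call; the explicit loop is shown only to make the chunking visible.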



#2 james

james

    Guard

  • Moderators
  • 221 posts
  • LocationLeeds, U.K.

Posted 21 August 2013 - 12:00 AM

All file uploading and processing should be handled as a background task.

You should not perform blocking operations like this.

 

There is a superb Railscast that describes exactly how to do what you are attempting in a number of different ways, including using jQuery to upload multiple images in the background, with progress bars, directly to Amazon S3. (I know you are uploading videos, but it's all just files.)

 

The jQuery project can be found here

https://github.com/r...uery-fileupload

Have a good look at the source code, play with the app and see if it meets your needs.

 

The Railscast is a subscription cast. If you want to take a look then here's the link http://railscasts.co...ng-to-amazon-s3



Programming is just about problem solving!


#3 nimbus510

nimbus510

    Passenger

  • Members
  • 3 posts

Posted 21 August 2013 - 05:35 AM

I have looked at jquery-file-uploader before but it's not really relevant in this situation since that's all really just client side logic.  Nothing is blocked server-side until AFTER the upload takes place.  

 

After the upload takes place, I understand I can use something like delayed_job to handle copying the file to its final destination asynchronously.  
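
A rough sketch of that asynchronous step, assuming delayed_job's documented behaviour of enqueueing any object that responds to `perform`. `MoveUploadJob` and both paths are hypothetical. The sketch also assumes the upload has already been moved to a staging path that outlives the request, since the framework's temp file may be cleaned up once the request finishes.

```ruby
require "fileutils"

# Hypothetical job object. With delayed_job loaded, it could be queued as:
#   Delayed::Job.enqueue(MoveUploadJob.new(staging_path, final_path))
MoveUploadJob = Struct.new(:source_path, :dest_path) do
  # Runs in the background worker, not in the web request.
  def perform
    FileUtils.mkdir_p(File.dirname(dest_path))
    # Across filesystems (e.g. local disk to NAS) FileUtils.mv copies the
    # file and then removes the source.
    FileUtils.mv(source_path, dest_path)
  end
end
```

This only covers the time after the controller action runs; it does nothing for the delay before the action starts.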

 

The issue is that there is a period of time AFTER the upload completes and BEFORE my action runs that seems to be where the bottleneck is, and I have no idea what is happening during this period or how to account for it.



#4 james

james

    Guard

  • Moderators
  • 221 posts
  • LocationLeeds, U.K.

Posted 21 August 2013 - 09:57 AM

The delay will most likely be the physical writing of the file to the disk (flushing out the cache etc...).

 

The Railscast I pointed you to shows two different ways of uploading files directly to an asset host with no bottlenecks or blocking operations, and it is about how to avoid exactly the situation you find yourself in. The first way uses carrierwave and sidekiq; the second uses sidekiq and jquery-file-uploader.

 

It's not really something I have ever had to deal with, so I'm not in a position to share any code with you. Perhaps someone else in here will have better or more appropriate ideas.




#5 nimbus510

nimbus510

    Passenger

  • Members
  • 3 posts

Posted 21 August 2013 - 05:52 PM

The delay certainly has to do with physical writing of the file to disk, but not the final file write.  I suspect it is how Apache is handling the request before handing it off to Rails.
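
One way to test that suspicion is to log the moment a request first reaches the Rack stack. If the "rack entry" line only appears minutes after the upload hits 100%, the time is being spent upstream of the application (in the web server); if it appears promptly and the controller still starts late, the time is being spent inside the app before the action runs. This is a diagnostic sketch, not a fix, and `RequestArrivalLogger` is a hypothetical name.

```ruby
require "logger"

# Hypothetical Rack middleware: logs when a request enters the app and how
# long the rest of the stack takes to respond.
class RequestArrivalLogger
  def initialize(app, logger: Logger.new($stdout))
    @app = app
    @logger = logger
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @logger.info("rack entry #{env['REQUEST_METHOD']} #{env['PATH_INFO']} " \
                 "content_length=#{env['CONTENT_LENGTH']}")
    response = @app.call(env)
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @logger.info(format("rack exit after %.3fs", elapsed))
    response
  end
end
```

In a Rails app it would need to sit at the very top of the middleware stack (something like `config.middleware.insert_before 0, RequestArrivalLogger`), so that nothing else touches the request body first.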

 

Jquery-file-uploader only helps with the upload BEFORE the period of time I'm concerned about.  Sidekiq only helps AFTER the period of time I'm talking about.

 

I'm now looking into modporter, but documentation on it is pretty much limited to their small readme on github.  It looks like they had a website at some point, but it is now down and the only info I can find on it consists of blog posts linking to the no longer present website. 



#6 james

james

    Guard

  • Moderators
  • 221 posts
  • LocationLeeds, U.K.

Posted 21 August 2013 - 05:57 PM

nimbus510 wrote: "The delay certainly has to do with physical writing of the file to disk, but not the final file write.  I suspect it is how apache is handling the request before handing it off to rails."

 

 

That's interesting. If you are using Apache, can I recommend you switch to nginx with either Passenger, Unicorn or Rainbows? You should see a massive improvement. Worth experimenting with in a sandbox, I think.

 

Regarding modporter, I would be reluctant to use something that seems totally un-maintained.




#7 james

james

    Guard

  • Moderators
  • 221 posts
  • LocationLeeds, U.K.

Posted 22 August 2013 - 02:15 AM

Quote: "Sorry to go off on a tangent here, but is nginx or any of the other options you mentioned better than apache performance wise? All of my Rails applications are hosted on Apache, and the two technologies don't really seem fit for one-another. Not to mention managing vhosts on apache is a real pain in the ass."

 

nginx and Unicorn are the best combination that I have found so far; I have yet to try Thin or Rainbows.

Passenger is too heavy. As for Apache, well, I haven't used Apache with a Rails app since the early days of Rails 2.

You might find this useful: http://railscasts.co...3-nginx-unicorn







