Docs

Overview

FileSlide Streamer is an open-source, drop-in service for downloading multiple files as a single zip stream.

You can read about the motivation in our Medium post.

FileSlide is used by Whitebrick No Code DB to provide flexible file sharing from a spreadsheet-like interface.

API

The client makes a POST request to the endpoint containing the list of URIs to be zipped. POST is used instead of GET to accommodate long lists of URIs in a single request. The list of URIs is saved against a random UUID and the client is then redirected to a GET request that includes the UUID. FileSlide fetches the files from the URIs in parallel, zips them and streams the download to the client.

sequenceDiagram
  %%{init:{'sequence':{
    'mirrorActors': false,
    'messageFontSize': '14px'
  }}}%%
  participant Client
  participant FileSlide
  participant File Server A
  participant File Server B
  Client->>FileSlide: POST list of file URIs
  Note right of Client: https://...file1.pdf<br/>https://...file2.mov
  FileSlide->>Client: Save uuid and redirect
  Client->>FileSlide: GET uuid
  FileSlide->>File Server A: GET file1.pdf
  FileSlide->>File Server B: GET file2.mov
  File Server A-->>FileSlide: zip(file1.pdf, file2.mov)
  FileSlide-->>Client: download.zip

Initial Request Encoding

Parameters can be sent as either:

  1. form-urlencoded with multiple fs_uri_list[] parameters for the URI list;
  2. form-urlencoded with a single fs_uri_list parameter containing a JSON-encoded array for the URI list; or
  3. JSON-encoded body with a single fs_uri_list key holding an array for the URI list.
  <form action="https://stream.fileslide.io/download" method="post">
    <input type="hidden" name="fs_file_name" value="demo_download.zip" />
    <input type="hidden" name="fs_response_format" value="redirect" />
    <input type="hidden" name="fs_error_redirect_uri" value="https://example.com/landing/fileslide-error" />
    <!-- For authentication, either forward all headers sent to FileSlide on to example.com -->
    <input type="hidden" name="fs_forward_all_headers" value="true" />
    <!-- Or tell FileSlide to add a specific header -->
    <input type="hidden" name="fs_add_header-Authorization" value="ABC123" />
    <!-- Or use presigned URLs -->
    <input type="hidden" name="fs_uri_list[]" value="https://example.com/private/data/file1.pdf?presigned_key=ABC123" />
    <input type="hidden" name="fs_uri_list[]" value="https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123" />
    <input type="submit" value="Download Zip of 2 Files"/>
  </form>
  <form action="https://stream.fileslide.io/download" method="post">
    <input type="hidden" name="fs_file_name" value="demo_download.zip" />
    <input type="hidden" name="fs_response_format" value="redirect" />
    <input type="hidden" name="fs_error_redirect_uri" value="https://example.com/landing/fileslide-error" />
    <!-- For authentication, either forward all headers sent to FileSlide on to example.com -->
    <input type="hidden" name="fs_forward_all_headers" value="true" />
    <!-- Or tell FileSlide to add a specific header -->
    <input type="hidden" name="fs_add_header-Authorization" value="ABC123" />
    <!-- Or use presigned URLs -->
    <input type="hidden" name="fs_uri_list" value="[
      &quot;https://example.com/private/data/file1.pdf?presigned_key=ABC123&quot;,
      &quot;https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123&quot;
    ]" />
    <input type="submit" value="Download Zip of 2 Files"/>
  </form>
  // POST https://stream.fileslide.io/download
  // Content-Type: application/json
  {
    "fs_file_name": "demo_download.zip",
    // default fs_response_format: "json",
    // For authentication, either forward all headers sent to FileSlide on to example.com
    "fs_forward_all_headers": true,
    // Or tell FileSlide to add a specific header
    "fs_add_header-Authorization": "ABC123",
    // Or use presigned URLs
    "fs_uri_list": [
      "https://example.com/private/data/file1.pdf?presigned_key=ABC123",
      "https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123"
    ]
  }
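
The three encodings above can also be produced from Ruby with only the standard library. The following is a minimal sketch, not part of FileSlide itself: the helper names (form_body_repeated, form_body_json_list, json_body) are hypothetical, and nothing is sent over the network.

```ruby
require "json"
require "uri"

URIS = [
  "https://example.com/private/data/file1.pdf?presigned_key=ABC123",
  "https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123"
].freeze

# (1) form-urlencoded: one fs_uri_list[] pair per URI.
def form_body_repeated(uris, extra = {})
  URI.encode_www_form(extra.to_a + uris.map { |uri| ["fs_uri_list[]", uri] })
end

# (2) form-urlencoded: a single fs_uri_list pair whose value is a JSON array.
def form_body_json_list(uris, extra = {})
  URI.encode_www_form(extra.to_a + [["fs_uri_list", JSON.generate(uris)]])
end

# (3) JSON body: fs_uri_list is a plain array inside the JSON object.
def json_body(uris, extra = {})
  JSON.generate(extra.merge("fs_uri_list" => uris))
end
```

Bodies (1) and (2) go with Content-Type: application/x-www-form-urlencoded, and body (3) with Content-Type: application/json.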

Initial POST Request

| Header | Description |
| --- | --- |
| Content-Type: application/x-www-form-urlencoded | Default sent by web browsers. Use this for request encodings (1) and (2) above. |
| Content-Type: application/json | Use this for request encoding (3) above. |

| Parameter | Description |
| --- | --- |
| fs_uri_list | Required. An array of file URIs to be fetched and zipped (see Fetching Files). |
| fs_request_id | Optional. Default: <generated random uuid>. Used in the Subsequent GET Request and reported in analytics for tracking. |
| fs_file_name | Optional. Default: download.zip. The name of the zip file (used with Content-Disposition: attachment). |
| fs_forward_all_headers | Optional. Default: false. Include all additional headers from the client in the requests for fetching files. |
| fs_add_header-<Header-Name> | Optional. Default: <none>. Multiple allowed. Adds the header Header-Name with this value in requests for fetching files. fs_forward_all_headers takes precedence. |
| fs_response_format | Optional. One of "html", "json" or "redirect" (see below). Default: "html" for form-urlencoded requests, "json" for JSON-encoded requests. |
| fs_error_redirect_uri | Required only for fs_response_format: "redirect". A full URI (beginning with http:// or https://) to redirect users to in case of errors. |
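
A client can check its parameters against these rules before posting. The sketch below is an illustration only: check_params is a hypothetical helper, the error keys are the ones listed under Initial POST Response, and the server's own validation may be stricter or ordered differently.

```ruby
# Returns an error key (String) for the first rule that fails,
# or nil when the parameters look acceptable.
def check_params(params)
  uris = params["fs_uri_list"]
  return "MALFORMED_URI_LIST" unless uris.is_a?(Array)
  return "EMPTY_URI_LIST" if uris.empty?
  return "DUPLICATE_URIS" if uris.uniq.length != uris.length

  if params["fs_response_format"] == "redirect"
    target = params["fs_error_redirect_uri"].to_s
    return "EMPTY_ERROR_REDIRECT_URI" if target.empty?
    # The parameter table asks for a full URI beginning with http:// or https://.
    return "INVALID_ERROR_REDIRECT_URI" unless target.start_with?("http://", "https://")
  end

  nil
end
```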

Initial POST Response

The format of error responses is set with the fs_response_format parameter in the request above.

  1. html - displays a generic, FileSlide-branded HTML page asking the user to contact their system administrator for help. Default for Content-Type: application/x-www-form-urlencoded.
  2. json - returns a JSON-formatted error key, message and, if applicable, an array of erroneous records. Default for Content-Type: application/json.
  3. redirect - redirects to a GET request for the URI specified in fs_error_redirect_uri and passes the corresponding error key and, if applicable, an array of erroneous records (see below). Used for displaying custom error pages.

FileSlide Error

An error occurred while trying to create your zip file. Please try again or contact your system administrator with the error message below.

Error Message

One or more of the files requested for zipping have not been permitted:
- https://example.com/private/data/file1.pdf?presigned_key=ABC123
- https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123

  {
    "fs_error_key": "UNAUTHORIZED_URIS",
    "fs_error_message": "One or more of the files requested for zipping have not been permitted.",
    "fs_error_records": [
      "https://example.com/private/data/file1.pdf?presigned_key=ABC123",
      "https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123"
    ]
  }
  303 "https://example.com/landing/fileslide-error?fs_error_key=UNAUTHORIZED_URIS
  &fs_error_records=https%3A%2F%2Fexample.com%2Fprivate%2Fdata%2Ffile1.pdf%3F
  presigned_key%3DABC123%3Bhttps%3A%2F%2Fbackup.example.com%2Fprivate%2F2020%2F
  file2.mov%3Fpresigned_key%3DABC123"

| Code | Error Key | Description |
| --- | --- | --- |
| 303 | | Response to a successful request. The Location header contains the URL of the subsequent GET request for the download, /stream/<fs_request_id>. |
| 400 | MALFORMED_JSON_BODY | Message: The request body must be valid JSON when using the application/json content type header |
| 400 | MALFORMED_URI_LIST | Message: The fs_uri_list parameter value must either be a form-url encoded or json encoded array |
| 400 | EMPTY_URI_LIST | Message: The fs_uri_list parameter value must contain at least one URI |
| 400 | DUPLICATE_URIS | Message: The fs_uri_list parameter value contains duplicate URIs |
| 400 | INVALID_URIS | Message: The fs_uri_list parameter value contains one or more invalid URIs |
| 400 | MISSING_REQUEST_ID | Message: This URL is missing a Request ID |
| 400 | EMPTY_ERROR_REDIRECT_URI | Message: The fs_error_redirect_uri parameter value is empty and is required for redirect response format |
| 400 | INVALID_ERROR_REDIRECT_URI | Message: The fs_error_redirect_uri parameter value is invalid and is required for redirect response format |
| 404 | DOWNLOAD_EXPIRED | Message: This download is unavailable or has expired |
| | UNKNOWN | Message: Unknown server error |
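
A page served at fs_error_redirect_uri can read the error details back out of the query string. This is a minimal sketch with a hypothetical helper name (parse_error_redirect); it assumes the records are joined with ";" as in the redirect example above, so a URI that itself contains ";" would be split incorrectly.

```ruby
require "uri"

# Extracts fs_error_key and fs_error_records from the redirect target URL.
# Assumption: records are separated by ";" after URL-decoding.
def parse_error_redirect(url)
  query = URI.decode_www_form(URI.parse(url).query.to_s).to_h
  {
    key: query["fs_error_key"],
    records: query.fetch("fs_error_records", "").split(";")
  }
end
```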

Subsequent GET Request

A successful initial POST request responds with a 303 redirect to a secret URL containing a randomly generated UUID (or the custom UUID, if one was specified in fs_request_id). The URL is given in the Location response header, and the client follows it with a GET request.

e.g. GET https://stream.fileslide.io/stream/aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee

  • The UUID can be used to track the request through logs, stats and analytics.
  • After the download completes successfully, the GET request URL expires.
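
A client that handles the redirect itself needs to turn the Location header into the stream URL. The sketch below uses a hypothetical helper (stream_target) and assumes the last path segment is the request ID; URI.join covers both a relative Location such as /stream/<fs_request_id> and an absolute one.

```ruby
require "uri"

UUID_PATTERN = /\A\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/

# Resolves the Location header of the 303 response against the POST endpoint
# and pulls the request ID out of the /stream/<fs_request_id> path.
def stream_target(endpoint, location)
  url = URI.join(endpoint, location)
  request_id = url.path.split("/").last.to_s
  { url: url.to_s, request_id: request_id, uuid: UUID_PATTERN.match?(request_id) }
end
```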

Subsequent GET Response

A successful subsequent GET request responds with 200 OK or 206 Partial Content and a binary/octet-stream data stream. If there is an error, the response uses the same format (html, json or redirect) that was set for the initial POST request. The exception is the error keys below, which always respond as HTML regardless of the format set.

  • MISSING_REQUEST_ID
  • MALFORMED_UUID
  • DOWNLOAD_EXPIRED

| Code | Error Key | Description |
| --- | --- | --- |
| 200 | | Response to a successful request. The entire zip file is streamed from the beginning |
| 206 | | Response to a successful range request. Used for download resuming |
| 400 | INVALID_RANGE_HEADER | Message: Range header must start with "bytes=" |
| 416 | MULTIPART_UNSUPPORTED | Message: Multipart ranges are not supported |
| 416 | MALFORMED_RANGE | Message: Range could not be parsed |
| 416 | RANGE_NOT_SATISFIABLE | Message: Start of range outside zip size |
| 403 | UNAUTHORIZED_URIS | Message: One or more of the files requested for zipping have not been permitted |
| 500 | UPSTREAM_ERROR | Message: Could not connect to authorization server |
| 500 | CHECKSUM_ERROR | Message: Error occurred during checksum computation |
| 502 | FAILED_FETCHING_URIS | Message: One or more of the files requested for zipping could not be fetched |
| | UNKNOWN | Message: Unknown server error |
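
The four range-related rows can be read as a sequence of checks on the Range header. The following sketch illustrates that reading and is not FileSlide's actual parser: classify_range is a hypothetical name, the order of the checks is an assumption, and suffix ranges such as bytes=-500 or ranges whose end precedes their start are not handled.

```ruby
# Classifies a Range header value against the documented error keys.
# Returns [status, error_key]; the key is nil when the range is acceptable.
def classify_range(header, zip_size)
  value = header.to_s
  return [400, "INVALID_RANGE_HEADER"] unless value.start_with?("bytes=")

  spec = value.delete_prefix("bytes=")
  return [416, "MULTIPART_UNSUPPORTED"] if spec.include?(",")

  match = /\A(\d+)-(\d*)\z/.match(spec)
  return [416, "MALFORMED_RANGE"] unless match

  # The first byte requested must lie inside the zip.
  return [416, "RANGE_NOT_SATISFIABLE"] if match[1].to_i >= zip_size

  [206, nil]
end
```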

Authentication & Authorization

URI Authorization

Because FileSlide Streamer can make requests to any public URI, the application first checks that each requested URI has been allow-listed, to prevent exploitation.

The fs_uri_list parameter from the initial request is sent in an HTTP POST request to an upstream authorization endpoint, which responds with 200 OK and the JSON body below.

{
  "authorized": true|false
}
  • If the authorization endpoint cannot be reached or returns anything other than 200 OK, FileSlide Streamer returns the 500 UPSTREAM_ERROR status and key.
  • If one or more URIs are not allow-listed the authorization endpoint returns "authorized": false in the JSON body and FileSlide Streamer returns the 403 UNAUTHORIZED_URIS status and key.
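
The two rules above amount to a small decision on the upstream response. A minimal sketch of that decision follows, using a hypothetical function name (authorization_outcome); treating an unparseable body as an upstream failure is an assumption that the text does not confirm.

```ruby
require "json"

# Maps the upstream authorization response to FileSlide's outcome.
# Returns nil when the URIs are authorized, otherwise [status, error_key].
def authorization_outcome(status, body)
  return [500, "UPSTREAM_ERROR"] unless status == 200

  parsed = JSON.parse(body)
  authorized = parsed.is_a?(Hash) && parsed["authorized"] == true
  authorized ? nil : [403, "UNAUTHORIZED_URIS"]
rescue JSON::ParserError
  # Assumption: a 200 response with a non-JSON body counts as an upstream error.
  [500, "UPSTREAM_ERROR"]
end
```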

User Authentication & Authorization

FileSlide Streamer is designed to be a lightweight, drop-in service and does not handle any user authentication or authorization.

There are two options for authentication and authorization:

  1. Presign the URIs passed in fs_uri_list.
  2. Pass credential tokens in the initial request header (using fs_forward_all_headers or fs_add_header-<Header-Name>) and implement checks on your file servers.
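
For option 2, the fs_add_header-<Header-Name> parameters can be generated from a hash of headers. This is a small sketch with a hypothetical helper name (header_params); the header names and values shown are placeholders.

```ruby
# Turns { "Authorization" => "ABC123" } into
# { "fs_add_header-Authorization" => "ABC123" }.
def header_params(headers)
  headers.transform_keys { |name| "fs_add_header-#{name}" }
end

# Merge the generated parameters into the rest of the request parameters.
request_params = {
  "fs_uri_list" => ["https://example.com/private/data/file1.pdf"]
}.merge(header_params("Authorization" => "ABC123"))
```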

Fetching Files

FileSlide Streamer attempts to fetch and zip each of the files specified in the fs_uri_list parameter with the following conditions:

  • Each URI must be authorized with the upstream authorization API. If one or more URIs are not authorized, FileSlide fails early and returns the 403 UNAUTHORIZED_URIS status and key with those URIs listed in the fs_error_records parameter.
  • FileSlide then makes an HTTP GET request for the first byte of each URI to check availability. If one or more URIs are not available, FileSlide fails early and returns the 502 FAILED_FETCHING_URIS status and key with those URIs listed in the fs_error_records parameter.
  • Once all URIs are checked, FileSlide makes a GET for the full file and streams the data.
  • If the fs_forward_all_headers or fs_add_header-<Header-Name> parameters are used, FileSlide includes these headers in both the availability checking request and the full file request.
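
The availability check can be pictured as a ranged GET carrying the same extra headers as the full request. The sketch below only builds the request and does not send it. It assumes the first byte is requested with Range: bytes=0-0, which the text does not state explicitly, and first_byte_request is a hypothetical name.

```ruby
require "net/http"
require "uri"

# Builds (but does not send) a GET request for the first byte of a URI.
# Assumption: the first byte is requested with the header Range: bytes=0-0.
def first_byte_request(uri, extra_headers = {})
  request = Net::HTTP::Get.new(URI.parse(uri))
  request["Range"] = "bytes=0-0"
  # Forwarded or added headers are applied to the check as well.
  extra_headers.each { |name, value| request[name] = value }
  request
end
```

A file server that honours the range would typically answer 206 with a one-byte body, while a server that ignores it answers 200 with the whole file.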

Resuming Downloads

FileSlide supports most resumable download cases by maintaining a cache of checksums and using multithreading with high concurrency to fetch file streams. Edge cases involving resuming zip downloads of large files (GBs) may not be fulfilled in time, in which case FileSlide falls back to responding with the entire file. If you have a particular use case that requires specific resume/range request functionality please let us know and we'll be happy to find a solution for you.
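
On the client side, resuming means asking for the remaining bytes and being ready for the fallback to a full response. A minimal sketch under those assumptions follows; resume_request and write_mode_for are hypothetical helpers, and the request is built but not sent.

```ruby
require "net/http"
require "uri"

# Builds a GET request for the rest of a partially downloaded zip.
def resume_request(stream_url, bytes_already_saved)
  request = Net::HTTP::Get.new(URI.parse(stream_url))
  request["Range"] = "bytes=#{bytes_already_saved}-"
  request
end

# Chooses the file mode for the local copy from the response status.
def write_mode_for(status)
  case status
  when 206 then "ab" # partial content: append to the bytes already on disk
  when 200 then "wb" # fallback to the entire file: start over
  else raise ArgumentError, "unexpected status #{status}"
  end
end
```

Checking the status before writing matters because appending a full 200 response to a partial file would corrupt the archive.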

Multiplex Downloads

FileSlide implements best-effort support for multiplex range requests depending on the file size, number of connections and the speed of downloading. If you have a particular use case that requires specific functionality please let us know and we'll be happy to find a solution for you.

Getting Started

FileSlide Streamer is a Ruby Sinatra application.

Running locally requires the following:

  • Ruby >= 2.6.6
  • Redis DB

The rerun gem is also recommended for development.

To get started:

  • Clone the repository
  • Copy the env file: cp .env.example .env
  • Update REDIS_URI in the .env file
  • bash scripts/start_dev.sh to start the server

To run the demo with remote files and the remote test Authorization and Reporting endpoints:

  • Open test/static/local/demo.html with a web browser

To run the demo with local files and the local demo Authorization and Reporting endpoints:

  • bash scripts/start_dev.sh to start the server
  • bash scripts/start_test_file_server.sh to start the local file server (serves from test/file_server/files)
  • bash scripts/start_test_upstream_server.sh to start the local Authorization and Reporting server
  • Update the .env file and set AUTHORIZATION_ENDPOINT=http://localhost:9294/authorize and REPORT_ENDPOINT=http://localhost:9294/report
  • Open test/static/local/demo.html with a web browser

Testing

To run the test suite, cd test then run bundle exec rspec from the root folder of the repo. For testing the zip streaming, the test suite starts up a second Puma process to deliver static files from spec/fixtures. You can extend this server with extra static files or custom endpoints as needed.