Docs
Overview
FileSlide Streamer is an Open Source drop-in service for downloading multiple files as a single zip stream.
You can read about the motivation in our Medium post.
FileSlide is used by Whitebrick No Code DB to provide flexible file sharing from a spreadsheet-like interface.
API
The client makes a POST request to the endpoint containing the list of URIs to be zipped. POST is used instead of GET to accommodate long lists of URIs in a single request. The list of URIs is saved against a random UUID and the client is then redirected to a GET request that includes the UUID. FileSlide fetches the files from the URIs in parallel, zips and streams the download to the client.
sequenceDiagram
%%{init:{'sequence':{
'mirrorActors': false,
'messageFontSize': '14px'
}}}%%
participant Client
participant FileSlide
participant File Server A
participant File Server B
Client->>FileSlide: POST list of file URIs
Note right of Client: https://...file1.pdf<br/>https://...file2.mov
FileSlide->>Client: Save uuid and redirect
Client->>FileSlide: GET uuid
FileSlide->>File Server A: GET file1.pdf
FileSlide->>File Server B: GET file2.mov
File Server A-->>FileSlide: zip(file1.pdf, file2.mov)
FileSlide-->>Client: download.zip
Initial Request Encoding
Parameters can be sent as either:
1. form-urlencoded with multiple fs_uri_list[] parameters for the URI list;
2. form-urlencoded with a single fs_uri_list parameter containing a JSON-encoded array for the URI list; or
3. a JSON-encoded body with a single fs_uri_list key with a JSON-encoded array for the URI list.
<form action="https://stream.fileslide.io/download" method="post">
<input type="hidden" name="fs_file_name" value="demo_download.zip" />
<input type="hidden" name="fs_response_format" value="redirect" />
<input type="hidden" name="fs_error_redirect_uri" value="https://example.com/landing/fileslide-error" />
<!-- For authentication, either forward all headers sent to FileSlide on to example.com -->
<input type="hidden" name="fs_forward_all_headers" value="true" />
<!-- Or tell FileSlide to add a specific header -->
<input type="hidden" name="fs_add_header-Authorization" value="ABC123" />
<!-- Or use presigned URLs -->
<input type="hidden" name="fs_uri_list[]" value="https://example.com/private/data/file1.pdf?presigned_key=ABC123" />
<input type="hidden" name="fs_uri_list[]" value="https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123" />
<input type="submit" value="Download Zip of 2 Files"/>
</form>
<form action="https://stream.fileslide.io/download" method="post">
<input type="hidden" name="fs_file_name" value="demo_download.zip" />
<input type="hidden" name="fs_response_format" value="redirect" />
<input type="hidden" name="fs_error_redirect_uri" value="https://example.com/landing/fileslide-error" />
<!-- For authentication, either forward all headers sent to FileSlide on to example.com -->
<input type="hidden" name="fs_forward_all_headers" value="true" />
<!-- Or tell FileSlide to add a specific header -->
<input type="hidden" name="fs_add_header-Authorization" value="ABC123" />
<!-- Or use presigned URLs -->
<input type="hidden" name="fs_uri_list" value="[
&quot;https://example.com/private/data/file1.pdf?presigned_key=ABC123&quot;,
&quot;https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123&quot;
]" />
<input type="submit" value="Download Zip of 2 Files"/>
</form>
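The two form-urlencoded variants above can also be produced outside a browser. The following is a minimal sketch using only Ruby's standard library; the variable names are illustrative and not part of FileSlide, and only the fs_uri_list parameter is shown.

```ruby
require "json"
require "uri"

uris = [
  "https://example.com/private/data/file1.pdf?presigned_key=ABC123",
  "https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123"
]

# Encoding (1): one fs_uri_list[] pair per URI.
body_multi = URI.encode_www_form(uris.map { |uri| ["fs_uri_list[]", uri] })

# Encoding (2): a single fs_uri_list pair whose value is a JSON array.
body_single = URI.encode_www_form([["fs_uri_list", JSON.generate(uris)]])
```

Either string would be sent as the request body with the Content-Type: application/x-www-form-urlencoded header.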
// POST https://stream.fileslide.io/download
// Content-Type: application/json
{
"fs_file_name": "demo_download.zip",
// default fs_response_format: "json",
// For authentication, either forward all headers sent to FileSlide on to example.com
"fs_forward_all_headers": true,
// Or tell FileSlide to add a specific header
"fs_add_header-Authorization": "ABC123",
// Or use presigned URLs
"fs_uri_list": [
"https://example.com/private/data/file1.pdf?presigned_key=ABC123",
"https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123"
]
}
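As a rough illustration of the JSON-encoded variant, the sketch below builds the initial POST request with Ruby's Net::HTTP. The helper name build_fileslide_request is hypothetical, and the request is only constructed here, not sent.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: builds (but does not send) the initial POST request.
def build_fileslide_request(endpoint, uris, file_name)
  payload = { "fs_file_name" => file_name, "fs_uri_list" => uris }
  request = Net::HTTP::Post.new(URI(endpoint))
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(payload)
  request
end

request = build_fileslide_request(
  "https://stream.fileslide.io/download",
  ["https://example.com/private/data/file1.pdf?presigned_key=ABC123",
   "https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123"],
  "demo_download.zip"
)

# Sending it is a network call, so it is left commented out:
# uri = URI("https://stream.fileslide.io/download")
# response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
# response["Location"] # redirect target when the request succeeds
```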
Initial POST Request
Header | Description |
---|---|
Content-Type: application/x-www-form-urlencoded | Default sent by web browser. Use this for request encoding (1) and (2) above. |
Content-Type: application/json | Use this for request encoding (3) above. |
Parameter | Description |
---|---|
fs_uri_list | Required. An array of file URIs to be fetched and zipped (See Fetching Files). |
fs_request_id | Optional. Default: <generated random uuid>. Used in the Subsequent GET Request and reported in analytics for tracking. |
fs_file_name | Optional. Default: download.zip. The name of the zip file (used with Content-Disposition: attachment). |
fs_forward_all_headers | Optional. Default: false. Include all additional headers from the client in the requests for fetching files. |
fs_add_header-<Header-Name> | Optional. Default: <none>. Multiple allowed. Adds the header Header-Name with this value in requests for fetching files. fs_forward_all_headers takes precedence. |
fs_response_format | Optional. Default: "html" for form-urlencoded, "json" for json encoded. "html" or "json" or "redirect" (see below). |
fs_error_redirect_uri | Required for fs_response_format: "redirect" only. A full URI (beginning with http:// or https://) to redirect users in case of errors. |
Initial POST Response
The response format is set with the fs_response_format parameter in the request above.
- html - displays a generic, FileSlide branded HTML page asking the user to contact their system administrator for help. Default for Content-Type: application/x-www-form-urlencoded.
- json - returns a JSON formatted error key, message and array of erroneous records if applicable. Default for Content-Type: application/json.
- redirect - redirects to a GET request for the URI specified in fs_error_redirect_uri and passes the corresponding error key and array of erroneous records if applicable (see below). Used for displaying custom error pages.
FileSlide Error
An error occurred while trying to create your zip file. Please try again or contact your system administrator with the error message below.
Error Message
One or more of the files requested for zipping have not been permitted:
- https://example.com/private/data/file1.pdf?presigned_key=ABC123
- https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123
{
"fs_error_key": "UNAUTHORIZED_URIS",
"fs_error_message": "One or more of the files requested for zipping have not been permitted.",
"fs_error_records": [
"https://example.com/private/data/file1.pdf?presigned_key=ABC123",
"https://backup.example.com/private/2020/file2.mov?presigned_key=ABC123"
]
}
303 "https://example.com/landing/fileslide-error?fs_error_key=UNAUTHORIZED_URIS
&fs_error_records=https%3A%2F%2Fexample.com%2Fprivate%2Fdata%2Ffile1.pdf%3F
presigned_key%3DABC123%3Bhttps%3A%2F%2Fbackup.example.com%2Fprivate%2F2020%2F
file2.mov%3Fpresigned_key%3DABC123"
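A custom error page behind fs_error_redirect_uri has to read these query parameters back. In the redirect example above, the erroneous records appear to be joined with a semicolon (%3B) before URL encoding; the sketch below assumes that separator, and the helper name parse_fileslide_error is hypothetical.

```ruby
require "uri"

# Hypothetical helper for the page served at fs_error_redirect_uri.
# Assumes fs_error_records is a semicolon-separated list, as in the example above.
def parse_fileslide_error(query_string)
  params = URI.decode_www_form(query_string).to_h
  {
    key: params["fs_error_key"],
    records: params.fetch("fs_error_records", "").split(";")
  }
end

query = "fs_error_key=UNAUTHORIZED_URIS&fs_error_records=" \
        "https%3A%2F%2Fexample.com%2Fprivate%2Fdata%2Ffile1.pdf%3Fpresigned_key%3DABC123" \
        "%3Bhttps%3A%2F%2Fbackup.example.com%2Fprivate%2F2020%2Ffile2.mov%3Fpresigned_key%3DABC123"

error = parse_fileslide_error(query)
```

Under this assumption, a URI that itself contains a semicolon would be split incorrectly, so treat the records as display-only hints.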
Code | Error Key | Description |
---|---|---|
303 | | Response to a successful request. The Location header contains the subsequent GET request for download: /stream/<fs_request_id> |
400 | MALFORMED_JSON_BODY | Message: The request body must be valid JSON when using the application/json content type header |
400 | MALFORMED_URI_LIST | Message: The fs_uri_list parameter value must either be a form-url encoded or json encoded array |
400 | EMPTY_URI_LIST | Message: The fs_uri_list parameter value must contain at least one URI |
400 | DUPLICATE_URIS | Message: The fs_uri_list parameter value contains duplicate URIs |
400 | INVALID_URIS | Message: The fs_uri_list parameter value contains one or more invalid URIs |
400 | MISSING_REQUEST_ID | Message: This URL is missing a Request ID |
400 | EMPTY_ERROR_REDIRECT_URI | Message: The fs_error_redirect_uri parameter value is empty and is required for redirect response format |
400 | INVALID_ERROR_REDIRECT_URI | Message: The fs_error_redirect_uri parameter value is invalid and is required for redirect response format |
404 | DOWNLOAD_EXPIRED | Message: This download is unavailable or has expired |
| | UNKNOWN | Message: Unknown server error |
Subsequent GET Request
A successful initial POST request responds with a 303 redirect to a GET request for a secret URL containing a randomly generated UUID (or a custom UUID if specified in fs_request_id). This URL is available in the Location response header.
eg: GET https://stream.fileslide.io/stream/aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
- The UUID can be used to track the request through logs, stats and analytics.
- After the download completes successfully, the GET request URL expires.
Subsequent GET Response
A successful subsequent GET request responds with 200 OK or 206 Partial Content and a binary/octet-stream data stream.
If there is an error, the same format (html, json or redirect) set for the initial POST request is used for this response.
The exception is the keys below, which always respond as HTML regardless of the format that was set.
- MISSING_REQUEST_ID
- MALFORMED_UUID
- DOWNLOAD_EXPIRED
Code | Error Key | Description |
---|---|---|
200 | | Response to a successful request. The entire zip file is streamed from the beginning |
206 | | Response to a successful range request. Used for download resuming |
400 | INVALID_RANGE_HEADER | Message: Range header must start with "bytes=" |
416 | MULTIPART_UNSUPPORTED | Message: Multipart ranges are not supported |
416 | MALFORMED_RANGE | Message: Range could not be parsed |
416 | RANGE_NOT_SATISFIABLE | Message: Start of range outside zip size |
403 | UNAUTHORIZED_URIS | Message: One or more of the files requested for zipping have not been permitted |
500 | UPSTREAM_ERROR | Message: Could not connect to authorization server |
500 | CHECKSUM_ERROR | Message: Error occurred during checksum computation |
502 | FAILED_FETCHING_URIS | Message: One or more of the files requested for zipping could not be fetched |
| | UNKNOWN | Message: Unknown server error |
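To make the range-related rows above concrete, here is a minimal sketch of how a Range header could be classified into those status codes and error keys. It is not FileSlide's actual implementation: the function name check_range_header is hypothetical, and for simplicity it treats suffix ranges such as bytes=-500 as malformed.

```ruby
# Hypothetical classifier: returns [status, error_key] for a Range header,
# or [206, nil] when the range can be served from a zip of zip_size bytes.
def check_range_header(header, zip_size)
  return [400, "INVALID_RANGE_HEADER"] unless header.start_with?("bytes=")

  spec = header.delete_prefix("bytes=")
  return [416, "MULTIPART_UNSUPPORTED"] if spec.include?(",")

  # Accepts "start-" and "start-end"; anything else counts as unparsable here.
  match = spec.match(/\A(\d+)-(\d*)\z/)
  return [416, "MALFORMED_RANGE"] unless match

  start = match[1].to_i
  return [416, "RANGE_NOT_SATISFIABLE"] if start >= zip_size

  [206, nil]
end

resume_result = check_range_header("bytes=1024-", 4096)
```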
Authentication & Authorization
URI Authorization
Because FileSlide Streamer can make requests to any public URIs, the application first checks that each requested URI has been allow-listed to prevent exploitation.
The fs_uri_list parameter from the initial request is sent with an HTTP POST request to an upstream authorization endpoint that responds with 200 OK and the JSON body below.
{
"authorized": true|false
}
- If the authorization endpoint can not be reached or returns anything other than 200 OK, FileSlide Streamer returns the 500 UPSTREAM_ERROR status and key.
- If one or more URIs are not allow-listed the authorization endpoint returns "authorized": false in the JSON body and FileSlide Streamer returns the 403 UNAUTHORIZED_URIS status and key.
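The decision an upstream authorization endpoint has to make can be sketched as below. This is only an illustration under stated assumptions: the host-based allow-list, the helper name authorization_body and the way the URI list reaches it are all hypothetical; only the "authorized" key in the JSON response comes from the description above.

```ruby
require "json"
require "uri"

# Hypothetical decision logic for an authorization endpoint: every URI must
# point at an allow-listed host, otherwise the whole list is rejected.
def authorization_body(uri_list, allowed_hosts)
  authorized = uri_list.all? do |raw|
    begin
      allowed_hosts.include?(URI.parse(raw).host)
    rescue URI::InvalidURIError
      false
    end
  end
  JSON.generate({ "authorized" => authorized })
end

allowed = ["example.com", "backup.example.com"]
ok_body = authorization_body(
  ["https://example.com/private/data/file1.pdf?presigned_key=ABC123"], allowed
)
denied_body = authorization_body(["https://evil.example.org/file.bin"], allowed)
```

A real endpoint would return this body with 200 OK in both cases, since any other status is reported by FileSlide Streamer as UPSTREAM_ERROR.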
User Authentication & Authorization
FileSlide Streamer is designed to be a lightweight, drop-in service and does not handle any user authentication or authorization.
There are 2 options for authentication and authorization:
- Presign the URIs passed in fs_uri_list.
- Pass credential tokens in the initial request header (using fs_forward_all_headers or fs_add_header-<Header-Name>) and implement checks on your file servers.
Fetching Files
FileSlide Streamer attempts to fetch and zip each of the files specified in the fs_uri_list parameter with the following conditions:
- Each URI must be authorized with the upstream authorization API. If one or more URIs are not authorized, FileSlide fails early and returns the 403 UNAUTHORIZED_URIS status and key with those URIs listed in the fs_error_records parameter.
- FileSlide then makes an HTTP GET request for the first byte of each URI to check availability. If one or more URIs are not available, FileSlide fails early and returns the 502 FAILED_FETCHING_URIS status and key with those URIs listed in the fs_error_records parameter.
- Once all URIs are checked, FileSlide makes a GET request for the full file and streams the data.
- If the fs_forward_all_headers or fs_add_header-<Header-Name> parameters are used, FileSlide includes these headers in both the availability checking request and the full file request.
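How the header parameters could translate into the headers used for fetching is sketched below. This is an interpretation, not FileSlide's code: the helper name fetch_headers is hypothetical, header names are compared case-sensitively for brevity, and "fs_forward_all_headers takes precedence" is read here as forwarded client headers winning when a name clashes with an fs_add_header-<Header-Name> parameter.

```ruby
# Hypothetical helper: derives the headers sent when fetching files from the
# request parameters and the headers the client sent to FileSlide.
def fetch_headers(params, client_headers)
  prefix = "fs_add_header-"
  added = params.select { |name, _value| name.start_with?(prefix) }
                .map { |name, value| [name.delete_prefix(prefix), value] }
                .to_h
  forward = [true, "true"].include?(params["fs_forward_all_headers"])
  # Assumed reading of "takes precedence": forwarded headers win on a clash.
  forward ? added.merge(client_headers) : added
end

params = {
  "fs_forward_all_headers" => "true",
  "fs_add_header-Authorization" => "ABC123",
  "fs_add_header-X-Trace" => "demo"
}
headers = fetch_headers(params, { "Authorization" => "Bearer from-client" })
```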
Resuming Downloads
FileSlide supports most resumable download cases by maintaining a cache of checksums and using multithreading with high concurrency to fetch file streams. Edge cases involving resuming zip downloads of large files (GBs) may not be fulfilled in time, in which case FileSlide falls back to responding with the entire file. If you have a particular use case that requires specific resume/range request functionality please let us know and we'll be happy to find a solution for you.
Multiplex Downloads
FileSlide implements best-effort support for multiplex range requests depending on the file size, number of connections and the speed of downloading. If you have a particular use case that requires specific functionality please let us know and we'll be happy to find a solution for you.
Getting Started
FileSlide Streamer is a Ruby Sinatra application.
Running locally requires the following:
- Ruby >= 2.6.6
- Redis DB
The rerun gem is also recommended for development
To get started:
- Clone the repository
- Copy the env file: mv .env.example .env
- Update REDIS_URI in the .env file
- Run bash scripts/start_dev.sh to start the server
To run the demo with remote files and the remote test Authorization and Reporting endpoints:
- Open test/static/local/demo.html with a web browser
To run the demo with local files and the local demo Authorization and Reporting endpoints:
- Run bash scripts/start_dev.sh to start the server
- Run bash scripts/start_test_file_server.sh to start the local file server (serves from test/file_server/files)
- Run bash scripts/start_test_upstream_server.sh to start the local Authorization and Reporting server
- Update the .env file and set AUTHORIZATION_ENDPOINT=http://localhost:9294/authorize and REPORT_ENDPOINT=http://localhost:9294/report
- Open test/static/local/demo.html with a web browser
Testing
To run the test suite, cd test then run bundle exec rspec from the root folder of the repo. For testing the zip streaming, the test suite starts up a second Puma process to deliver static files from spec/fixtures. You can extend this server with extra static files or custom endpoints as needed.