Dropbox
Requirements
Functional Requirements
- Users should be able to upload a file from any device.
- Users should be able to download a file from any device.
- Users should be able to share a file with other users and view the files shared with them.
Non-Functional Requirements
- The system should be highly available (prioritizing availability over consistency).
- The system should support files as large as 50GB.
- The system should be secure and reliable. We should be able to recover files if they are lost or corrupted.
- The system should make upload, download, and sync times as fast as possible (low latency).
The Set Up
Defining the Core Entities
- User
- File: raw data.
- FileMetadata: Includes the file name, size, and the user who uploaded it.
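As a rough illustration, the entities above could be modeled as follows. This is only a sketch: the field names (`file_id`, `size_bytes`, `uploaded_by`, `created_at`) are assumptions made for the example, and the `status` field anticipates the "uploading"/"uploaded" states used in the upload flow.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class User:
    user_id: str  # hypothetical identifier field


@dataclass
class FileMetadata:
    file_id: str          # hypothetical key, also used to locate the raw File bytes
    name: str
    size_bytes: int
    uploaded_by: str      # user_id of the uploader
    status: str = "uploading"  # becomes "uploaded" once the bytes are stored
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```

The raw File itself is just bytes, so it needs no model of its own here; only its metadata is structured.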
API or System Interface
- Upload
POST /files
Request:
{
File,
FileMetadata
}
- Download
GET /files/{fileId} -> File & FileMetadata
- Share
POST /files/{fileId}/share
Request:
{
User[] // The users to share the file with
}
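To make the Share endpoint concrete, here is a minimal in-memory sketch of the bookkeeping it might do. The `ShareIndex` class and its method names are hypothetical; a real system would keep this in the metadata database rather than in process memory.

```python
from collections import defaultdict


class ShareIndex:
    """In-memory stand-in for a shares table (hypothetical)."""

    def __init__(self) -> None:
        # file_id -> users it is shared with, and the reverse lookup
        self._by_file: dict[str, set[str]] = defaultdict(set)
        self._by_user: dict[str, set[str]] = defaultdict(set)

    def share(self, file_id: str, user_ids: list[str]) -> None:
        # Roughly what POST /files/{fileId}/share would record
        for uid in user_ids:
            self._by_file[file_id].add(uid)
            self._by_user[uid].add(file_id)

    def shared_with(self, user_id: str) -> set[str]:
        # Files shared with a user; returns a copy so callers cannot mutate state
        return set(self._by_user.get(user_id, set()))
```

Keeping the reverse lookup (`_by_user`) is one way to serve "view the files shared with them" without scanning every file.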
High-Level Design
Upload Files
- For metadata, we can use a NoSQL database like DynamoDB.
Bad Solution: Uploading Files to a Single Server
- Not scalable: we need to add more storage to the server as the number of files grows.
- Not reliable: if the server goes down, we lose access to all of the files.
Good Solution: Store Files in Blob Storage
- We send the file to a blob storage service such as Amazon S3 or Google Cloud Storage and store the metadata in our database.
- Challenge 1: Either the metadata or the file is lost
  - The file may be uploaded but the metadata not stored.
  - Or the metadata may be stored but the file not uploaded.
  - Solution: Only save the metadata if the file is uploaded successfully.
- Challenge 2: Redundant upload
  - The file is transferred twice: once to our backend, and once from the backend to the blob storage.
  - Solution: Let the user upload the file directly to the blob storage service.
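The ordering fix for Challenge 1 can be sketched like this: write the bytes first and save the metadata only if that write succeeded. `BlobStore` here is an in-memory stand-in with a simulated failure switch, not a real storage client, and `upload` is a hypothetical backend helper.

```python
class BlobStore:
    """In-memory stand-in for a blob storage service (hypothetical)."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}
        self.fail_next = False  # flip to simulate a failed upload

    def put(self, key: str, data: bytes) -> None:
        if self.fail_next:
            self.fail_next = False
            raise IOError("simulated upload failure")
        self.objects[key] = data


def upload(blob_store: BlobStore, metadata_db: dict, file_id: str,
           name: str, data: bytes, uploaded_by: str) -> None:
    # 1. Store the bytes first. If this raises, we never reach step 2,
    #    so no metadata row can point at a missing file.
    blob_store.put(file_id, data)
    # 2. Only now record the metadata.
    metadata_db[file_id] = {
        "name": name,
        "size": len(data),
        "uploaded_by": uploaded_by,
    }
```

Note that this version still routes the bytes through the backend, so it addresses Challenge 1 but not Challenge 2.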
Great Solution
- Let the user upload the file directly to the blob storage service.
- We can use presigned URLs: the backend generates a URL that the user can use to upload the file.
- Once the file is uploaded, the blob storage service sends a notification to our backend so we can update the metadata.
- Presigned URLs: URLs that give the user time-limited permission to upload a file to a specific location in the blob storage service.
- Three-step process:
  - Request a presigned URL from the backend (which generates it using its credentials for the blob storage service, such as S3) and save the file metadata in our database with a status of "uploading".
  - Use the presigned URL to upload the file to the blob storage service. This is a PUT request sent directly to the presigned URL, with the file as the body of the request.
  - Once the file is uploaded, the blob storage service sends a notification to our backend, which updates the file metadata in our database to a status of "uploaded".
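The three steps can be simulated end to end with the standard library. This is a simplified sketch, not how S3 actually signs requests: the HMAC scheme, the `blob.example` host, the shared `SECRET`, and the `Backend` and `BlobStore` classes are all assumptions made for illustration. With real S3, the backend would typically produce the URL through the SDK (for example boto3's `generate_presigned_url`), and the notification would come from the storage service's event mechanism.

```python
import hashlib
import hmac
import time
import uuid
from urllib.parse import parse_qs, urlparse

SECRET = b"demo-secret"  # hypothetical signing key shared by backend and blob store


def sign(key: str, expires: int) -> str:
    # Signature covers the method, object key, and expiry time
    msg = f"PUT:{key}:{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


class Backend:
    def __init__(self) -> None:
        self.metadata: dict[str, dict] = {}

    def request_upload(self, name: str, size: int, user_id: str, ttl: int = 300):
        # Step 1: save metadata as "uploading" and return a presigned URL
        file_id = uuid.uuid4().hex
        self.metadata[file_id] = {
            "name": name, "size": size,
            "uploaded_by": user_id, "status": "uploading",
        }
        expires = int(time.time()) + ttl
        url = (f"https://blob.example/{file_id}"
               f"?expires={expires}&sig={sign(file_id, expires)}")
        return file_id, url

    def on_upload_complete(self, file_id: str) -> None:
        # Step 3: notification from the blob store flips the status
        self.metadata[file_id]["status"] = "uploaded"


class BlobStore:
    """In-memory stand-in for the blob storage service (hypothetical)."""

    def __init__(self, notify) -> None:
        self.objects: dict[str, bytes] = {}
        self.notify = notify  # callback into the backend

    def put(self, url: str, body: bytes) -> None:
        # Step 2: the client PUTs straight to the URL; verify before storing
        parsed = urlparse(url)
        key = parsed.path.lstrip("/")
        query = parse_qs(parsed.query)
        expires = int(query["expires"][0])
        sig = query["sig"][0]
        valid = hmac.compare_digest(sig, sign(key, expires))
        if expires < time.time() or not valid:
            raise PermissionError("invalid or expired URL")
        self.objects[key] = body
        self.notify(key)
```

A client would call `backend.request_upload(...)`, then `store.put(url, data)` with the returned URL. The file bytes never pass through `Backend`, which is the point of the design.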