remoteEaze

File Storage & Uploads

Enterprise file storage with S3-compatible backends, direct-to-bucket uploads, content validation, and multi-tenant isolation.

The storage system handles secure file uploads using direct-to-bucket architecture. Rather than uploading files through the API server (which creates bottlenecks), files upload directly to S3-compatible storage (AWS S3, MinIO, SeaweedFS) via short-lived presigned URLs.

This approach enables large file support (up to 50 MB by default), real-time upload progress tracking, and better scalability under load.

How It Works

  1. User selects file
  2. Browser requests presigned URL from API
  3. API returns temporary upload URL (expires in 5 min)
  4. Browser uploads file directly to S3
  5. Browser notifies API: "upload complete"
  6. Background validation job scans the file
  7. File transitions to READY status

The entire flow happens in three API calls: initiate, upload (direct to S3), and complete. The actual bytes never touch the API server.

Building Blocks

1. Object Classes

Defines what types of files are allowed where. Each class has validation rules and retention policies.

| Class | Allowed Types | Max Size | Retention |
| --- | --- | --- | --- |
| USER_PROFILE_PHOTO | JPEG, PNG, WebP | 5 MB | Permanent |
| COMPANY_LOGO | JPEG, PNG, WebP, SVG | 5 MB | Permanent |
| CUSTOMER_SIGNATURE | JPEG, PNG, WebP | 2 MB | Permanent |
| KYC_IDENTITY_DOCUMENT | JPEG, PNG, PDF | 10 MB | Compliance Archive |
| KYC_SUPPORTING_DOCUMENT | JPEG, PNG, PDF | 10 MB | Compliance Archive |
| LOAN_DOCUMENT | JPEG, PNG, PDF | 20 MB | Compliance Archive |
| IMPORT_FILE | CSV, XLS, XLSX | 50 MB | 90 days |
| STATEMENT | PDF | 20 MB | Compliance Archive |
| REPORT | PDF, CSV | 20 MB | Compliance Archive |
| GENERAL_DOCUMENT | JPEG, PNG, PDF | 20 MB | Permanent |

The system enforces these rules against the actual file contents, not the filename — a file whose extension says "jpg" is still rejected if its bytes are not a valid JPEG.
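The class rules above can be represented as a simple lookup table consulted at initiation time. This is an illustrative sketch — the names `OBJECT_CLASS_RULES` and `checkUploadPolicy` are assumptions, not the actual remoteEaze schema, and only two classes are shown:

```typescript
type ObjectClassRule = {
  allowedTypes: string[]; // MIME types permitted for this class
  maxSizeBytes: number;
};

const MB = 1024 * 1024;

// Two entries from the table above; the remaining classes follow the same shape.
const OBJECT_CLASS_RULES: Record<string, ObjectClassRule> = {
  USER_PROFILE_PHOTO: {
    allowedTypes: ['image/jpeg', 'image/png', 'image/webp'],
    maxSizeBytes: 5 * MB,
  },
  IMPORT_FILE: {
    allowedTypes: [
      'text/csv',
      'application/vnd.ms-excel',
      'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
    ],
    maxSizeBytes: 50 * MB,
  },
};

// Returns null when the upload is allowed, or a rejection reason otherwise.
function checkUploadPolicy(
  objectClass: string,
  contentType: string,
  fileSize: number,
): string | null {
  const rule = OBJECT_CLASS_RULES[objectClass];
  if (!rule) return 'unknown object class';
  if (!rule.allowedTypes.includes(contentType)) return 'content type not allowed';
  if (fileSize > rule.maxSizeBytes) return 'file too large';
  return null;
}
```

Rejecting at initiation (before a presigned URL is issued) means disallowed files never consume bucket storage at all.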

2. Upload Sessions

Every upload attempt creates an upload session that tracks the lifecycle from presign generation through completion. Sessions capture:

| Field | Purpose |
| --- | --- |
| initiatedBy | Who started the upload |
| presignExpiresAt | When the upload URL expires |
| expectedContentType | What content type was declared |
| expectedFileSize | What size was declared |
| policySnapshot | Copy of validation rules at initiation time |

This creates an audit trail independent of the stored object itself.

3. Validation Pipeline

After upload completes, a background job verifies file integrity:

| Check | Purpose |
| --- | --- |
| Magic bytes | File signature matches declared content-type (prevents extension spoofing) |
| Image dimensions | Width and height within bounds (max 8192x8192, prevents decompression bombs) |
| SVG safety | Detects embedded scripts and event handlers (prevents XSS) |
| PDF structure | Validates PDF header format |
| CSV structure | Checks for valid delimiters and line breaks |

If validation fails, the file is rejected and scheduled for cleanup.
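The magic-byte check can be sketched in a few lines. The signatures below match the Content Verification table later in this page; the function name and fail-closed behavior are illustrative assumptions, not the pipeline's actual code:

```typescript
// File signatures for a few of the supported types.
const SIGNATURES: Record<string, number[]> = {
  'image/jpeg': [0xff, 0xd8, 0xff],
  'image/png': [0x89, 0x50, 0x4e, 0x47],
  'application/pdf': [0x25, 0x50, 0x44, 0x46], // "%PDF"
};

// True only when the file's leading bytes match the declared content type.
// Unknown types fail closed rather than being waved through.
function matchesDeclaredType(bytes: Uint8Array, declaredType: string): boolean {
  const sig = SIGNATURES[declaredType];
  if (!sig) return false;
  return sig.every((b, i) => bytes[i] === b);
}
```

Because the check reads bytes rather than trusting metadata, renaming `malware.exe` to `photo.jpg` does not get it past validation.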

Security Layers

Presigned URL Expiry

Upload and download URLs are temporary and cryptographically signed:

| URL Type | Expiry | Purpose |
| --- | --- | --- |
| Upload URL | 5 minutes | One-time upload to specific key |
| Download URL | 15 minutes | Authorized access to file |

URLs are bound to specific keys and cannot be used to access other objects.

Content Verification

The system inspects actual file contents, not just extensions:

| File Type | Validation Method |
| --- | --- |
| JPEG | Magic bytes FF D8 FF + dimension extraction from EXIF |
| PNG | Magic bytes 89 50 4E 47 + IHDR chunk dimensions |
| WebP | RIFF + WEBP signature |
| PDF | %PDF- header validation |
| SVG | Script element and event handler detection |
| Excel | OLE2 or ZIP (OOXML) signature |
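The SVG safety screen can be sketched as a set of pattern checks. This is a deliberately minimal illustration — a production scanner would parse the XML rather than pattern-match, and the function name is an assumption:

```typescript
// Flags SVG text containing script elements, inline event handlers,
// or javascript: URLs — the XSS vectors named above.
function svgLooksUnsafe(svgText: string): boolean {
  if (/<script[\s>]/i.test(svgText)) return true; // <script> elements
  if (/\son\w+\s*=/i.test(svgText)) return true; // onload=, onclick=, ...
  if (/href\s*=\s*["']\s*javascript:/i.test(svgText)) return true; // javascript: links
  return false;
}
```

SVG gets this extra scrutiny because, unlike JPEG or PNG, it is an XML document that browsers will happily execute when rendered inline.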

Encryption

All files are stored with server-side encryption (AES256). Copied objects inherit the same encryption.

Tenant Isolation

Files are organized hierarchically in the bucket by tenant:

tenants/{tenantId}/{ownerEntityType}/{ownerEntityId}/{objectId}

Example:

tenants/acme-corp/User/usr-123/obj-abc123.jpg
tenants/globex-inc/Customer/cust-456/obj-def456.pdf

Database queries always filter by tenantId. The unique constraint @@unique([tenantId, storageKey]) ensures no key collisions within a tenant while allowing identical paths across tenants.

Cross-tenant access is impossible because:

  1. Repository queries include WHERE tenantId = X
  2. Storage keys contain the tenant prefix at the root level
  3. Object IDs from one tenant resolve to null when queried from another tenant's scope
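The key-building step can be sketched as a small helper. The function name is illustrative, but the shape matches the hierarchy above; the guard against `/` in components prevents a crafted ID from escaping its tenant prefix:

```typescript
// Builds the tenant-scoped storage key described above. Components may not
// contain '/' or be empty, so no caller can break out of its prefix.
function buildStorageKey(
  tenantId: string,
  ownerEntityType: string,
  ownerEntityId: string,
  objectId: string,
): string {
  const parts = [tenantId, ownerEntityType, ownerEntityId, objectId];
  if (parts.some((p) => p.length === 0 || p.includes('/'))) {
    throw new Error('invalid storage key component');
  }
  return `tenants/${parts.join('/')}`;
}
```

Centralizing key construction in one function also means an IAM policy scoped to `tenants/{tenantId}/*` covers every object the service can ever write for that tenant.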

Checksum Verification

If the client provides an MD5 checksum, it is verified against the S3 ETag after upload. Checksum mismatches trigger automatic deletion of the corrupted file.
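For a single-part PUT (as in the presigned flow here), the S3 ETag is the hex MD5 of the object, wrapped in quotes — which is what makes this comparison possible. A sketch, with illustrative function names; note that multipart-upload ETags are computed differently and would not match this way:

```typescript
import { createHash } from 'node:crypto';

// Hex MD5 of the payload, as a client would compute before uploading.
function md5Hex(data: string): string {
  return createHash('md5').update(data).digest('hex');
}

// S3 returns ETags quoted (e.g. '"d41d8..."'); strip quotes and compare
// case-insensitively.
function checksumMatchesEtag(clientMd5Hex: string, s3Etag: string): boolean {
  return s3Etag.replace(/"/g, '').toLowerCase() === clientMd5Hex.toLowerCase();
}
```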

Retention Policies

Files are automatically cleaned up based on their retention policy:

| Policy | Duration | Auto-Cleanup | Use Case |
| --- | --- | --- | --- |
| PERMANENT | Forever | No | Profile photos, logos |
| COMPLIANCE_ARCHIVE | Forever | No | KYC docs, loan agreements |
| SUPERSEDED_30D | 30 days | Yes | Old versions of replaceable files |
| TEMP_UPLOAD_24H | 24 hours | Yes | Abandoned uploads |
| FAILED_UPLOAD_7D | 7 days | Yes | Rejected files (validation failed) |
| IMPORT_ARTIFACT_90D | 90 days | Yes | Temporary import files |
| LEGAL_HOLD | Forever | Blocked | Litigation/compliance holds |

A background job runs hourly to clean up expired objects. Objects under legal hold are never deleted regardless of retention policy.
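Computing `retentionExpiresAt` at upload time reduces cleanup to a simple timestamp comparison. A sketch using the policy names from the table above — the TTL map and helper name are illustrative assumptions:

```typescript
// TTL in hours per policy; null means the object never auto-expires.
const RETENTION_TTL_HOURS: Record<string, number | null> = {
  PERMANENT: null,
  COMPLIANCE_ARCHIVE: null,
  LEGAL_HOLD: null,
  SUPERSEDED_30D: 30 * 24,
  TEMP_UPLOAD_24H: 24,
  FAILED_UPLOAD_7D: 7 * 24,
  IMPORT_ARTIFACT_90D: 90 * 24,
};

// Stamp the expiry when the object is created; null for non-expiring policies.
function retentionExpiresAt(policy: string, uploadedAt: Date): Date | null {
  const ttl = RETENTION_TTL_HOURS[policy];
  if (ttl == null) return null;
  return new Date(uploadedAt.getTime() + ttl * 3600 * 1000);
}
```

With the expiry precomputed, the hourly job only needs an indexed query on `retentionExpiresAt <= now()` rather than re-evaluating policy logic per object.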

Replacement and Supersession

Certain object classes support replacement (profile photos, logos). When a new file replaces an existing one:

  1. New upload initiated
  2. Previous file marked as superseded
  3. Superseded file gets SUPERSEDED_30D policy
  4. New file becomes current (isCurrent = true)
  5. After 30 days, superseded file is cleaned up

Document classes (KYC, loan docs) do not support replacement — multiple files can be current simultaneously.
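The supersession steps above amount to a small state transition. A sketch — the field names mirror the docs (`isCurrent`, retention policy), but the types and function are illustrative, not the actual data model:

```typescript
type StoredObjectState = {
  id: string;
  isCurrent: boolean;
  retentionPolicy: string;
};

// Demotes the previous file and promotes the replacement in one step,
// so there is never more than one current object for a replaceable class.
function supersede(
  previous: StoredObjectState,
  replacementId: string,
): { previous: StoredObjectState; current: StoredObjectState } {
  return {
    previous: { ...previous, isCurrent: false, retentionPolicy: 'SUPERSEDED_30D' },
    current: { id: replacementId, isCurrent: true, retentionPolicy: 'PERMANENT' },
  };
}
```

In a real implementation both writes would run in one database transaction, so readers never observe zero or two current files.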

Legal Holds

Mark sensitive files with legal hold to block deletion regardless of retention policy. This is useful for:

  • Litigation holds
  • Compliance investigations
  • Audit requirements

Once placed, a hold can only be released by an authorized user. The hold reason and timestamp are recorded in the audit trail.

API Usage

Uploading a File

Step 1: Initiate

POST /api/v1/uploads/initiate
{
  "objectClass": "USER_PROFILE_PHOTO",
  "ownerEntityType": "User",
  "ownerEntityId": "usr_123",
  "originalFilename": "photo.jpg",
  "contentType": "image/jpeg",
  "fileSize": 1024000
}

Response:

{
  "storedObject": {
    "id": "obj_abc123",
    "storageKey": "tenants/acme-corp/User/usr_123/obj_abc123",
    "status": "PENDING_UPLOAD"
  },
  "uploadUrl": "https://s3.amazonaws.com/...",
  "uploadUrlExpiresAt": "2026-01-15T10:35:00Z",
  "sessionId": "sess_xyz789"
}

Step 2: Upload

PUT the file bytes directly to uploadUrl.

Step 3: Complete

POST /api/v1/uploads/obj_abc123/complete
{
  "checksum": "d41d8cd98f00b204e9800998ecf8427e"
}

The object transitions to SCANNING status while validation runs asynchronously. Poll the object status or wait for webhook notification (if configured).

Downloading a File

Request a presigned download URL:

POST /api/v1/uploads/obj_abc123/download-url?disposition=attachment

Response:

{
  "downloadUrl": "https://s3.amazonaws.com/...",
  "expiresAt": "2026-01-15T10:50:00Z",
  "contentType": "image/jpeg",
  "originalFilename": "photo.jpg",
  "fileSize": "1024000"
}

The URL expires in 15 minutes. Disposition can be attachment (forces download) or inline (browser display).

Listing Files

GET /api/v1/uploads?ownerEntityType=User&ownerEntityId=usr_123&isCurrent=true

Returns current files for the specified owner. Supports filtering by object class, status, and pagination.

Deleting a File

DELETE /api/v1/uploads/obj_abc123

Soft-deletes the database record and hard-deletes the bucket object. Blocked if legal hold is active.

Frontend Integration

The useUploadFile hook manages the full lifecycle:

const { upload, cancel, state, progress } = useUploadFile();

// Initiates upload, uploads via XHR with progress, completes automatically
await upload(file, {
  objectClass: 'CUSTOMER_SIGNATURE',
  ownerEntityType: 'Customer',
  ownerEntityId: customerId
});

// State: 'idle' | 'initiating' | 'uploading' | 'completing' | 'done' | 'error'
// Progress: { loaded, total, percentage }

Uploads can be cancelled mid-flight. The hook handles cleanup of stale requests.

Real-Time Progress Updates

The upload system uses a two-phase progress model optimized for direct-to-bucket uploads:

Phase 1: Upload Progress (Client-Side)

Since files upload directly to S3 (not through the API server), progress tracking happens entirely in the browser using XMLHttpRequest (XHR) progress events:

  1. File selected
  2. XHR opened to presigned S3 URL
  3. xhr.upload.onprogress fires repeatedly
  4. UI updates: "Uploading... 48% (1.2 MB / 2.5 MB)"
  5. Upload completes

Why XHR instead of Fetch API? The Fetch API does not expose upload progress events. Only XHR provides xhr.upload.addEventListener("progress", ...) for real-time byte counters.

Progress data structure:

| Field | Description |
| --- | --- |
| loaded | Bytes uploaded so far |
| total | Total file size in bytes |
| percentage | Calculated as loaded / total (0-100%) |

No WebSocket or Server-Sent Events: The system intentionally avoids WebSocket/SSE connections for uploads. Since the data flows browser→S3 directly, there's no server involvement during the transfer that could publish progress updates.
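Deriving the UI readout from the progress event is a one-liner's worth of arithmetic. A sketch — `formatProgress` is an illustrative helper name, wired in practice to `xhr.upload.onprogress` via `e.loaded` / `e.total`:

```typescript
// Turns raw byte counters from an XHR progress event into display text.
function formatProgress(loaded: number, total: number): string {
  const pct = total > 0 ? Math.round((loaded / total) * 100) : 0;
  const mb = (n: number) => (n / (1024 * 1024)).toFixed(1);
  return `Uploading... ${pct}% (${mb(loaded)} MB / ${mb(total)} MB)`;
}
```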

Phase 2: Validation Status (Polling)

After the upload completes, a background validation job scans the file. The client polls for status changes:

| Status | Meaning | Polling |
| --- | --- | --- |
| SCANNING | Validation in progress | Poll every 2 seconds |
| READY | Validation passed | Stop polling |
| REJECTED | Validation failed | Stop polling |

Polling implementation:

  1. POST /complete returns { status: "SCANNING" }
  2. GET /uploads/:id every 2 seconds
  3. Status change detected
  4. UI updates: "Processing..." → "Complete" or "Failed"

Why polling instead of WebSocket?

  • Validation is typically 1-3 seconds (magic byte checks, not heavy processing)
  • Polling is simpler for intermittent/mobile connectivity
  • No persistent connection overhead for short-lived operations
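The polling loop can be sketched as follows. The names `isTerminal` and `pollUntilSettled` are illustrative, and the injected `getStatus` callback stands in for the `GET /uploads/:id` request:

```typescript
type UploadStatus = 'SCANNING' | 'READY' | 'REJECTED';

// READY and REJECTED end the polling loop; SCANNING keeps it going.
function isTerminal(status: UploadStatus): boolean {
  return status === 'READY' || status === 'REJECTED';
}

// Polls until a terminal status arrives, with a cap so a stuck scan
// cannot poll forever (30 attempts x 2 s = 1 minute by default).
async function pollUntilSettled(
  getStatus: () => Promise<UploadStatus>,
  intervalMs = 2000,
  maxAttempts = 30,
): Promise<UploadStatus> {
  for (let i = 0; i < maxAttempts; i++) {
    const status = await getStatus();
    if (isTerminal(status)) return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('validation timed out');
}
```

Injecting `getStatus` keeps the loop testable and lets the caller attach auth headers or an AbortController signal to the underlying request.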

Progress State Machine

idle
  ↓ (file selected)
initiating
  ↓ (presign URL obtained)
uploading ← [XHR progress events]
  ↓ (S3 confirms upload)
completing
  ↓ (API notified, validation queued)
SCANNING ← [poll every 2s]
  ↓ (validation result)
READY / REJECTED / ERROR

Cancellation Support

Uploads can be cancelled at any point before completion:

| Phase | Cancellation Method |
| --- | --- |
| initiating | AbortController signal |
| uploading | xhr.abort() |
| completing | AbortController signal |
| SCANNING | Cannot cancel (already in S3) |

Cancelled uploads leave orphaned S3 objects that the cleanup job removes automatically after 24 hours.

Offline Support (PWA)

For field agents working without connectivity:

| Capability | Implementation |
| --- | --- |
| Local drafts | IndexedDB (Dexie) stores attachment metadata |
| Upload queue | Files queued for upload when connection restored |
| Progress tracking | Upload status tracked across sessions |
| Retry logic | Failed uploads retried with exponential backoff |

When connectivity returns, the sync process uploads queued files and updates local records with server-generated IDs.

Configuration

Environment variables control storage behavior:

| Variable | Default | Description |
| --- | --- | --- |
| STORAGE_ENDPOINT | Required | S3-compatible endpoint URL |
| STORAGE_REGION | us-east-1 | AWS region or default |
| STORAGE_BUCKET | Required | Bucket name |
| STORAGE_ACCESS_KEY | Required | S3 access key |
| STORAGE_SECRET_KEY | Required | S3 secret key |
| STORAGE_PRESIGN_UPLOAD_EXPIRY_SECONDS | 300 | Upload URL validity (5 min) |
| STORAGE_PRESIGN_DOWNLOAD_EXPIRY_SECONDS | 900 | Download URL validity (15 min) |
| STORAGE_MAX_FILE_SIZE_BYTES | 52428800 | Maximum file size (50 MB) |

The system supports AWS S3, MinIO, and SeaweedFS as backends.
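A config loader applying the documented defaults might look like the sketch below. The variable names come from the table above; the loader itself (and omitting the access/secret keys for brevity) is illustrative:

```typescript
type StorageConfig = {
  endpoint: string;
  region: string;
  bucket: string;
  presignUploadExpirySeconds: number;
  presignDownloadExpirySeconds: number;
  maxFileSizeBytes: number;
};

// Fails fast on missing required variables; applies documented defaults
// for the rest. (STORAGE_ACCESS_KEY / STORAGE_SECRET_KEY omitted here.)
function loadStorageConfig(env: Record<string, string | undefined>): StorageConfig {
  const required = (name: string): string => {
    const value = env[name];
    if (!value) throw new Error(`${name} is required`);
    return value;
  };
  return {
    endpoint: required('STORAGE_ENDPOINT'),
    region: env.STORAGE_REGION ?? 'us-east-1',
    bucket: required('STORAGE_BUCKET'),
    presignUploadExpirySeconds: Number(env.STORAGE_PRESIGN_UPLOAD_EXPIRY_SECONDS ?? 300),
    presignDownloadExpirySeconds: Number(env.STORAGE_PRESIGN_DOWNLOAD_EXPIRY_SECONDS ?? 900),
    maxFileSizeBytes: Number(env.STORAGE_MAX_FILE_SIZE_BYTES ?? 52428800),
  };
}
```

Failing fast at startup on a missing endpoint or bucket is preferable to discovering the gap on the first upload request.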

Audit Trail

Every storage action is recorded in the ObjectEvent table:

| Action | When Recorded |
| --- | --- |
| UPLOAD_INITIATED | Presign URL generated |
| UPLOAD_COMPLETED | Client called complete endpoint |
| UPLOAD_FAILED | Upload verification failed |
| SCAN_PASSED | Background validation succeeded |
| SCAN_FAILED | Background validation failed |
| DOWNLOAD_URL_ISSUED | Presign URL generated for download |
| OBJECT_DELETED | File deleted by user |
| OBJECT_EXPIRED | Cleanup job removed expired file |
| HOLD_PLACED | Legal hold applied |
| HOLD_RELEASED | Legal hold removed |

Each entry includes actor ID, timestamp, request ID for correlation, and result status (success/failure).

Error Handling

Common error scenarios and their handling:

| Scenario | Response | Recovery |
| --- | --- | --- |
| Presign expired | 400 Bad Request | Re-initiate upload |
| File too large | 400 Bad Request | Compress or select smaller file |
| Invalid content type | 400 Bad Request | Convert to allowed format |
| Checksum mismatch | 400 Bad Request | Re-upload (file corrupted) |
| Validation failed | 200 + REJECTED status | File removed automatically |
| Legal hold active | 403 Forbidden | Remove hold before deleting |
| Object not found | 404 Not Found | Verify object ID and permissions |

Validation failures set the object status to REJECTED and trigger automatic cleanup after 7 days.

Storage Key Structure

Keys follow a predictable hierarchy for organization and potential IAM policy restrictions:

tenants/{tenantId}/{ownerEntityType}/{ownerEntityId}/{objectId}

| Component | Description |
| --- | --- |
| tenants/ | Static prefix for all tenant-scoped objects |
| {tenantId} | Tenant identifier (e.g., "acme-corp") |
| {ownerEntityType} | Entity category: User, Customer, Account, LoanActivity, etc. |
| {ownerEntityId} | Specific entity instance ID |
| {objectId} | Unique object identifier (UUID) |

This structure enables:

  • Prefix-based listing if needed
  • IAM policies scoped to tenant prefixes
  • Logical organization in S3 consoles
  • Efficient cleanup by tenant or entity

Cleanup and Maintenance

The storage cleanup job runs on a schedule to remove expired objects:

Every hour:
  1. Find objects where retentionExpiresAt <= now()
  2. Skip if isLegalHold = true
  3. Skip if policy is PERMANENT or COMPLIANCE_ARCHIVE
  4. Soft-delete database record
  5. Hard-delete bucket object
  6. Record OBJECT_EXPIRED event
  7. Continue in batches of 100

Orphaned bucket objects (where the DB delete succeeded but the S3 delete failed) can linger temporarily — a separate reconciliation process can identify and remove them.
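The job's eligibility check (steps 1-3 above) can be sketched as a pure predicate. The field names mirror the docs (`retentionExpiresAt`, `isLegalHold`); the type and function names are illustrative:

```typescript
type CleanupCandidate = {
  retentionPolicy: string;
  retentionExpiresAt: Date | null;
  isLegalHold: boolean;
};

// Mirrors the job's rules: legal hold always wins, permanent policies
// never expire, and everything else is eligible once past its expiry.
function isEligibleForCleanup(obj: CleanupCandidate, now: Date): boolean {
  if (obj.isLegalHold) return false;
  if (obj.retentionPolicy === 'PERMANENT' || obj.retentionPolicy === 'COMPLIANCE_ARCHIVE') {
    return false;
  }
  return obj.retentionExpiresAt !== null && obj.retentionExpiresAt <= now;
}
```

Keeping the predicate separate from the batching and deletion code makes the retention rules easy to unit-test in isolation.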
