@willwade
Created September 11, 2025 08:03
CouchDB issues

Analysis Summary

1. Primary Sources of Data Storage in CouchDB

Data Types Being Stored:

  • GridData: Complete grid configurations including elements, images, thumbnails, and additional files
  • MetaData: Global user settings, input configurations, color schemes, and integration settings
  • Dictionary: Word prediction dictionaries with JSON data
  • EncryptedObject: All data is encrypted and stored as base64-encoded strings

Key Storage Contributors:

class GridData extends Model({
    id: String,
    modelName: String,
    modelVersion: String,
    lastUpdateTime: [Number],
    isShortVersion: Boolean, // if true this object represents a non-full short version excluding binary base64 data
    label: [Object, String], //map locale -> translation, e.g. "de" => LabelDE
    rowCount: [Number],
    minColumnCount: [Number],
    gridElements: Model.Array(GridElement),
    additionalFiles: [Model.Array(AdditionalGridFile)],
    webRadios: [Model.Array(Webradio)],
    thumbnail: [Object], // map with 2 properties: [data, hash], where "data" is base64 Screenshot data and "hash" is the hash of the grid when the screenshot was made
    // ... (excerpt is truncated here in the source)

Major Storage Issues Identified:

  1. Base64 Image Data: GridImage objects store image data as base64 strings, which increases size by ~33%
  2. Automatic Thumbnails: Grid thumbnails are automatically generated and stored as base64 screenshots
  3. Additional Files: AdditionalGridFile objects store entire files as base64 data
  4. No Data Retention Policies: No automatic cleanup of old data
  5. Redundant Storage: Both full and short versions of encrypted data are stored
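
The ~33% figure in item 1 follows from how base64 works: every 3 input bytes become 4 output characters. A quick check in Node.js, where the 300,000-byte buffer is an arbitrary stand-in for image data and `base64Length` is a helper introduced only for this illustration:

```javascript
// Base64 encodes each 3-byte group as 4 ASCII characters,
// so the encoded length is 4 * ceil(n / 3).
function base64Length(byteCount) {
    return 4 * Math.ceil(byteCount / 3);
}

// Arbitrary stand-in for binary image data (not real application data).
const fakeImage = Buffer.alloc(300000, 0xab);
const encoded = fakeImage.toString('base64');

// Relative size increase caused by the encoding.
const overhead = encoded.length / fakeImage.length - 1;
console.log(encoded.length === base64Length(fakeImage.length)); // true
console.log((overhead * 100).toFixed(1) + '% larger'); // 33.3% larger
```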

2. Data Storage Patterns Analysis

Problematic Patterns:

if (updateThumbnail) {
    imageUtil.allImagesLoaded().then(async () => {
        let screenshot = await imageUtil.getScreenshot("#grid-container");
        let thumbnail = {
            data: screenshot,
            hash: newHash
        };
        dataService.saveThumbnail(this.renderGridData.id, thumbnail);
    })
}

Storage Inefficiencies:

  • Thumbnails are regenerated and saved frequently (every grid view)
  • Images are stored as base64 instead of binary attachments
  • No compression for large data objects
  • Sync conflicts create duplicate revisions that aren't cleaned up

dataUtil.removeLongPropertyValues = function (object, maxLength, removedPlaceholder) {
    if (!object) {
        return object;
    }
    removedPlaceholder = removedPlaceholder === undefined ? defaultRemovedPlaceholder : removedPlaceholder;
    maxLength = maxLength || 500;
    // ... (excerpt is truncated here in the source)

3. Specific Recommendations to Reduce Storage Costs

High Impact Changes:

1. Implement Configurable Data Retention Policies

  • Add automatic cleanup of old grid revisions (keep only last N versions)
  • Implement user-configurable retention periods for different data types
  • Add scheduled cleanup of orphaned thumbnails and images
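
A minimal sketch of how a retention check might select cleanup candidates. It assumes it runs on decrypted objects that carry the `lastUpdateTime` field from the model above; `selectExpiredIds` and `retentionDays` are hypothetical names, not part of the codebase:

```javascript
// Hypothetical retention helper: returns the ids of documents whose
// lastUpdateTime is older than the configured retention period.
// Documents without a numeric lastUpdateTime are never selected.
const DAY_MS = 24 * 60 * 60 * 1000;

function selectExpiredIds(docs, retentionDays, now = Date.now()) {
    const cutoff = now - retentionDays * DAY_MS;
    return docs
        .filter((doc) => typeof doc.lastUpdateTime === 'number' && doc.lastUpdateTime < cutoff)
        .map((doc) => doc.id);
}

// Example: with a 90-day retention period, only "old-thumb" is a candidate.
const exampleNow = Date.now();
const candidates = selectExpiredIds(
    [
        { id: 'old-thumb', lastUpdateTime: exampleNow - 120 * DAY_MS },
        { id: 'recent-grid', lastUpdateTime: exampleNow - 5 * DAY_MS }
    ],
    90,
    exampleNow
);
console.log(candidates);
```

The helper only selects candidates; which data types it may be applied to, and whether deletion needs user confirmation, is a policy decision left to the caller. For revision history specifically, CouchDB also exposes a per-database `_revs_limit` setting that caps how many revisions are tracked per document.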

2. Optimize Image Storage

  • Move images to CouchDB attachments instead of base64 in documents
  • Implement image compression with configurable quality settings
  • Add lazy loading for thumbnails (generate on-demand vs. pre-generate)
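
As an illustration of the first point, the sketch below moves a data-URL image out of a regular string property and into the `_attachments` map, using the inline attachment format (`content_type` plus base64 `data`) that CouchDB and PouchDB accept on write. The field name `data`, the attachment name `image`, and both function names are assumptions for this example. The sketch also ignores the encryption layer described above, which a real implementation would have to apply to the attachment payload as well:

```javascript
// Splits a data URL such as "data:image/png;base64,AAAA" into its parts.
// Returns null if the value is not a base64 data URL.
function parseDataUrl(dataUrl) {
    const match = /^data:([^;,]+);base64,(.*)$/s.exec(dataUrl);
    if (!match) {
        return null;
    }
    return { contentType: match[1], base64: match[2] };
}

// Returns a copy of `doc` where the image payload lives in `_attachments`
// instead of in a regular string property. `imageField` and
// `attachmentName` are illustrative names, not taken from the codebase.
function moveImageToAttachment(doc, imageField = 'data', attachmentName = 'image') {
    const parsed = parseDataUrl(doc[imageField]);
    if (!parsed) {
        return doc; // nothing to move
    }
    const { [imageField]: _removed, ...rest } = doc;
    return {
        ...rest,
        _attachments: {
            ...(doc._attachments || {}),
            [attachmentName]: { content_type: parsed.contentType, data: parsed.base64 }
        }
    };
}
```

The payload still travels as base64 in this inline form, but the database stores the attachment as binary, so the base64 overhead no longer counts against the stored document body.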

3. Reduce Thumbnail Storage

  • Make thumbnail generation optional/configurable
  • Implement thumbnail caching with expiration
  • Use smaller thumbnail dimensions and higher compression
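
One way to combine these points is a small decision function that runs before a screenshot is taken. This is a sketch under assumptions: `thumbnailsEnabled`, `minIntervalMs` and the `savedAt` timestamp are hypothetical additions, while `hash` matches the thumbnail property described in the model above:

```javascript
// Hypothetical decision helper. `settings.thumbnailsEnabled` and
// `settings.minIntervalMs` are assumed configuration options, and
// `existing.savedAt` is an assumed timestamp stored with the thumbnail.
function shouldSaveThumbnail(existing, newHash, settings, now = Date.now()) {
    if (!settings.thumbnailsEnabled) {
        return false; // user or admin has switched thumbnails off
    }
    if (!existing) {
        return true; // no thumbnail yet, so create the first one
    }
    if (existing.hash === newHash) {
        return false; // grid is unchanged, the stored thumbnail is still valid
    }
    // Grid changed: only save if the previous thumbnail is old enough.
    const age = now - (existing.savedAt || 0);
    return age >= settings.minIntervalMs;
}
```

Checking the hash and the minimum interval before taking the screenshot avoids both the rendering cost and the extra document revision when nothing relevant has changed.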

4. Add Data Cleanup Utilities

  • Extend existing cleanup scripts to handle user data retention
  • Add bulk operations for removing old/unused data
  • Implement database compaction scheduling
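
For the compaction point, CouchDB starts compaction of a database through `POST /{db}/_compact` with a JSON content type and answers `202 Accepted` when the job is queued. The sketch below assumes Node.js 18+ (global `fetch`), a caller-supplied authorization header with admin rights, and hypothetical function names. Scheduling (cron, timer, admin action) is deliberately left out:

```javascript
// Builds the HTTP request for CouchDB's compaction endpoint.
// `serverUrl`, `dbName` and `authHeader` are supplied by the caller.
function buildCompactionRequest(serverUrl, dbName, authHeader) {
    const base = serverUrl.replace(/\/+$/, ''); // drop trailing slashes
    return {
        url: `${base}/${encodeURIComponent(dbName)}/_compact`,
        options: {
            method: 'POST',
            headers: { 'Content-Type': 'application/json', Authorization: authHeader }
        }
    };
}

// Sends the request; resolves to true if CouchDB accepted the compaction job.
async function compactDatabase(serverUrl, dbName, authHeader) {
    const { url, options } = buildCompactionRequest(serverUrl, dbName, authHeader);
    const response = await fetch(url, options);
    return response.status === 202;
}
```

On the client side, PouchDB offers `db.compact()` for local databases, which could be triggered from the same schedule.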

Medium Impact Changes:

5. Optimize Encryption Storage

  • Remove redundant encryptedDataBase64Short when not significantly different
  • Implement compression before encryption
  • Use more efficient serialization formats
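
The order matters for the compression point: well-encrypted data looks random and barely compresses, so compression has to happen on the plaintext. The sketch below shows only the compression step, using Node's built-in `zlib` for illustration (a browser build would need a different gzip implementation). The function names are hypothetical, and the application's own encryption would be applied to the returned string:

```javascript
const zlib = require('zlib');

// Serializes and gzips an object, returning base64 text that can be
// handed to the existing encryption step.
function compressToBase64(object) {
    const json = JSON.stringify(object);
    return zlib.gzipSync(Buffer.from(json, 'utf8')).toString('base64');
}

// Reverses compressToBase64 after decryption.
function decompressFromBase64(base64) {
    const json = zlib.gunzipSync(Buffer.from(base64, 'base64')).toString('utf8');
    return JSON.parse(json);
}

// Illustrative data only: many similar elements, as in a large grid.
const sampleGrid = {
    gridElements: Array.from({ length: 200 }, (_, i) => ({
        id: 'element-' + i,
        label: { en: 'Label', de: 'Beschriftung' },
        width: 1,
        height: 1
    }))
};
const plainLength = JSON.stringify(sampleGrid).length;
const compressedLength = compressToBase64(sampleGrid).length;
console.log({ plainLength, compressedLength });
```

Repetitive JSON such as grid element lists tends to compress well. Embedded images that are already in a compressed format (JPEG, PNG) will gain much less.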

6. Implement Smart Sync

  • Add incremental sync for large objects
  • Implement conflict resolution that removes old revisions
  • Add sync pause during bulk operations
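
For the conflict point, PouchDB reports conflicting revisions when a document is fetched with `{ conflicts: true }`, and a specific revision can be deleted with `db.remove(docId, rev)`. A minimal sketch, assuming `db` is a PouchDB instance and that keeping PouchDB's automatically chosen winning revision is acceptable. The function name is hypothetical:

```javascript
// `db` is expected to be a PouchDB instance (or any object with compatible
// get/remove methods). Keeps the winning revision chosen by PouchDB and
// deletes all conflicting ones. Returns the number of revisions removed.
async function removeConflictingRevisions(db, docId) {
    const doc = await db.get(docId, { conflicts: true });
    const conflicts = doc._conflicts || [];
    for (const rev of conflicts) {
        await db.remove(docId, rev);
    }
    return conflicts.length;
}
```

This discards the losing revisions without merging them, so it is only safe where the losing edits are known to be expendable (for example regenerated thumbnails). Storage is reclaimed only once the database is compacted afterwards.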

Low Impact Changes:

7. Add Storage Monitoring

  • Implement storage usage tracking per user
  • Add warnings when approaching storage limits
  • Provide storage usage analytics in admin interface
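
A rough per-type breakdown can be approximated from the serialized size of each document. The sketch below groups by the `modelName` field from the model above; `summarizeStorage` is a hypothetical helper, and the numbers are estimates of JSON size, not of actual on-disk usage:

```javascript
// Sums the UTF-8 size of the JSON serialization of each document,
// grouped by its modelName ("unknown" if the field is missing).
function summarizeStorage(docs) {
    const totals = {};
    for (const doc of docs) {
        const key = doc.modelName || 'unknown';
        const bytes = Buffer.byteLength(JSON.stringify(doc), 'utf8');
        totals[key] = (totals[key] || 0) + bytes;
    }
    return totals;
}
```

For whole-database figures, the CouchDB database info endpoint (`GET /{db}`) reports a `sizes` object with `file`, `active` and `external` values, which is a more reliable basis for limits and alerts than client-side estimates.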

8. Optimize Data Models

  • Remove unnecessary metadata from frequently updated objects
  • Implement lazy loading for large properties
  • Add data deduplication for common elements
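
Deduplication of common elements could be based on content hashes, so that an image used in many grid elements is stored once and referenced by its hash. A minimal sketch using Node's built-in `crypto`. The `store`/`refs` layout and the function name are assumptions for illustration:

```javascript
const crypto = require('crypto');

// Takes a list of base64 image payloads and returns:
//   store: hash -> payload (each distinct payload stored once)
//   refs:  the hash for each input, in the original order
function deduplicateImages(images) {
    const store = {};
    const refs = images.map((base64) => {
        const hash = crypto.createHash('sha256').update(base64).digest('hex');
        store[hash] = base64;
        return hash;
    });
    return { store, refs };
}
```

Grid elements would then hold only the hash, and the payload would be looked up from the shared store when rendering.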

4. Implementation Priority

Phase 1 (Immediate - High ROI):

  1. Implement configurable thumbnail generation (optional/on-demand)
  2. Add image compression with quality settings
  3. Extend existing cleanup scripts for user data retention
  4. Add database compaction automation

Phase 2 (Short-term):

  1. Move images to CouchDB attachments
  2. Implement smart sync with conflict cleanup
  3. Add storage monitoring and alerts
  4. Optimize encryption storage format

Phase 3 (Long-term):

  1. Implement comprehensive data lifecycle management
  2. Add advanced compression and deduplication
  3. Consider tiered storage for infrequently accessed data

5. Existing Infrastructure to Leverage

The codebase already has several utilities that can be extended:

async function getDeletedItems(pouch) {
    let toDelete = [];
    let ids = [];
    return new Promise((resolve) => {
        let options = {
            style: 'main_only'
        };
        pouch
            .changes(options)
            .on('complete', function (info) {
                // ... (excerpt is truncated here in the source)

These existing cleanup and compaction scripts can be extended to handle user data retention policies and automated storage optimization.

The most impactful immediate changes are likely to be making thumbnail generation configurable and implementing basic data retention policies. As a rough estimate rather than a measured figure, these could reduce storage usage by 30-50% for typical users while maintaining full application functionality.
