Building a Looker Studio Community Connector for Speckle API

Let's look at the Speckle APIs and three methods for getting data into Looker Studio.

  1. https://speckle.guide/dev/server-rest-api.html
  2. https://developers.google.com/looker-studio/connector
  3. https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app

Speckle Service APIs: GraphQL vs. REST

Speckle provides both GraphQL and REST APIs to interact with your data on a Speckle Server. They serve similar purposes (accessing streams, commits, objects, etc.) but differ in how you structure your requests and receive data.

  1. REST API:
    • Concept: REST (Representational State Transfer) is an architectural style that uses standard HTTP methods (GET, POST, PUT, DELETE) to perform operations on resources identified by URLs.
    • How it works: You make requests to specific endpoints (URLs) that represent resources (like a stream, a commit, or a list of objects). The server responds with a fixed structure of data, usually in JSON format.
    • Pros:
      • Widely understood and used.
      • Simple to use for basic operations (e.g., GET /streams/:streamId/commits/:commitId/objects to get a list of objects).
      • Good for fetching predefined resources.
    • Cons:
      • Can lead to "over-fetching" (receiving more data than you need) or "under-fetching" (requiring multiple requests to get all related data).
      • Less flexible for complex queries that involve fetching data from multiple related resources in a single request.
    • Relevance to your links: The https://speckle.guide/dev/server-rest-api.html#downloading-a-list-of-objects link points directly to a REST API endpoint example.
  2. GraphQL API:
    • Concept: GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. It allows clients to request exactly the data they need.
    • How it works: You send a single query to a single endpoint (usually /graphql). The query specifies the structure of the data you want back, including nested relationships. The server responds with a JSON object that precisely matches the structure of your query.
    • Pros:
      • Avoids over-fetching and under-fetching by allowing you to request only the specific fields you need.
      • More efficient for complex queries that involve traversing relationships between data (e.g., getting a commit and details about its author and the objects it contains in one request).
      • Strongly typed schema provides better data validation and introspection.
    • Cons:
      • Requires understanding the GraphQL query language.
      • Can be more complex to set up on the server side.
      • Caching can be more involved compared to REST.
    • Relevance to your links: While the specific links you provided focus on REST, the Speckle documentation generally covers its GraphQL API as the primary way to interact programmatically for more complex applications.

For building a Looker Studio connector or data pipeline, you could use either API, depending on the complexity of the data you need to extract. For simply getting a list of objects from a commit, as in the REST example, the REST API is straightforward. For more complex queries involving user info, stream details, and object properties in a single fetch, GraphQL may be more efficient. The sketch below shows both request styles from Apps Script.
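
To make the difference concrete, here is a minimal sketch of both request styles from Google Apps Script. Everything configurable is a placeholder: the server URL, token, and IDs are yours to fill in, the REST path reuses the example cited above (confirm the exact route in the linked REST docs), and the GraphQL field names are illustrative, so verify them against your server's GraphQL explorer.

function compareSpeckleRequests() {
  // --- Placeholders: replace with your own server, token, and IDs ---
  var serverUrl = 'https://your-speckle-server.example.com';
  var token = 'YOUR_PERSONAL_ACCESS_TOKEN';
  var streamId = 'YOUR_STREAM_ID';
  var commitId = 'YOUR_COMMIT_ID';
  var headers = { Authorization: 'Bearer ' + token };

  // REST: GET a fixed resource shape from a URL-addressed endpoint
  // (path reuses the example above; confirm against the REST docs).
  var restResponse = UrlFetchApp.fetch(
      serverUrl + '/streams/' + streamId + '/commits/' + commitId + '/objects',
      { headers: headers });
  Logger.log(restResponse.getContentText());

  // GraphQL: POST one query to /graphql, naming exactly the fields you need
  // (field names are illustrative; check the server's GraphQL explorer).
  var query = 'query { stream(id: "' + streamId + '") { name } }';
  var gqlResponse = UrlFetchApp.fetch(serverUrl + '/graphql', {
    method: 'post',
    contentType: 'application/json',
    headers: headers,
    payload: JSON.stringify({ query: query })
  });
  Logger.log(gqlResponse.getContentText());
}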

Methods for Connecting Speckle Data to Looker Studio

Here are the pros and cons of the three methods you proposed:

  1. Create your own Looker Studio Connector with Google Apps Script:
    • Pros:
      • Direct Integration: Provides the most direct connection between Speckle and Looker Studio.
      • Customizable Schema: You have full control over which Speckle properties become fields in Looker Studio and how they are mapped.
      • Potential for Dynamic Configuration: Can allow users to select streams/commits directly within the Looker Studio data source setup.
      • Reusable: Once built, the connector can be used by anyone with access to it.
    • Cons:
      • Development Effort: Requires significant coding in Google Apps Script to implement the connector interface (getAuthType, getConfig, getSchema, getData); a minimal skeleton is sketched after this list.
      • Complexity of Data Mapping: Handling the nested structure of Speckle data and flattening it for Looker Studio requires careful logic.
      • Performance: Google Apps Script execution time limits and potential API call latency can impact performance for large datasets.
      • Maintenance: You are responsible for maintaining the connector and updating it if Speckle's API changes.
  2. Create a download tool using Google Apps Script that pulls data into Google Sheets, then connect Looker Studio with Google Sheets:
    • Pros:
      • Simpler Development (for the script): Writing a script to just pull data and put it into a sheet is generally simpler than building a full connector.
      • Familiar Data Source: Google Sheets is a very user-friendly and familiar data source for many Looker Studio users.
      • Easy Data Inspection: You can easily view and even manually edit the data in Google Sheets if needed.
      • Leverages Existing Looker Studio Connector: Looker Studio has a robust native connector for Google Sheets.
    • Cons:
      • Scalability Limits: Google Sheets has limitations on the number of rows and cell data, making it unsuitable for very large BIM datasets.
      • Performance: Querying large sheets from Looker Studio can be slow.
      • Manual Refresh/Triggering: The Apps Script needs to be triggered (manually, on a time-based trigger, or via a webhook) to update the data in the sheet, so the Looker Studio report won't be near-real-time unless the script runs frequently.
      • Data Structure Flattening Still Required: The Apps Script will still need to handle flattening the Speckle data structure before writing it to the sheet.
  3. Create a download tool that pulls data into Google BigQuery, then Looker Studio connects to BigQuery:
    • Pros:
      • Scalability and Performance: BigQuery is designed for large datasets and provides excellent query performance, making it suitable for large and complex BIM models.
      • Centralized Data Warehouse: BigQuery can serve as a central repository for your Speckle data and potentially other construction project data.
      • Leverages Existing Looker Studio Connector: Looker Studio has a highly optimized native connector for BigQuery.
      • Data Transformation Power: You can use BigQuery's powerful SQL capabilities to transform and model the data after it's loaded, potentially simplifying the data extraction script.
    • Cons:
      • Increased Complexity: Requires setting up and managing a BigQuery instance.
      • Cost: BigQuery usage incurs costs (storage and querying).
      • Development Effort (for the tool): You still need to build a tool (can be Google Apps Script, Python script, etc.) to extract data from Speckle and load it into BigQuery. This might involve using BigQuery APIs or client libraries.
      • Data Loading Process: You need to establish a reliable process for loading data into BigQuery (e.g., scheduled jobs, streaming inserts).
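
To give a sense of the development effort in option 1, here is a minimal, hypothetical skeleton of the Community Connector interface in Apps Script. Everything in it is a placeholder: the config input, field names, and row values are illustrative, getData omits the actual Speckle fetch, and a real connector additionally needs a manifest and, for private streams, authentication.

var cc = DataStudioApp.createCommunityConnector();

function getAuthType() {
  // NONE keeps the sketch simple; a real Speckle connector would likely
  // use KEY or OAUTH2 to pass a personal access token.
  return cc.newAuthTypeResponse().setAuthType(cc.AuthType.NONE).build();
}

function getConfig() {
  var config = cc.getConfig();
  config.newTextInput().setId('streamId').setName('Speckle Stream ID');
  return config.build();
}

function getFields() {
  var fields = cc.getFields();
  var types = cc.FieldType;
  // Placeholder fields: map these to real Speckle object properties.
  fields.newDimension().setId('objectId').setType(types.TEXT);
  fields.newDimension().setId('speckleType').setType(types.TEXT);
  fields.newMetric().setId('childCount').setType(types.NUMBER);
  return fields;
}

function getSchema(request) {
  return { schema: getFields().build() };
}

function getData(request) {
  var requestedFields = getFields().forIds(
      request.fields.map(function(f) { return f.name; }));

  // Fetch from Speckle here (REST or GraphQL, as discussed above) and map
  // each object to a row whose values line up with the requested fields.
  var rows = []; // e.g. [{ values: ['abc123', 'Objects.BuiltElements.Wall', 4] }]

  return {
    schema: requestedFields.build(),
    rows: rows
  };
}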

Where to Find Help in the Community:

For all these approaches, the community is a valuable resource:

  • Speckle Community Forum: discourse.speckle.dev is the official Speckle community forum. This is the best place to ask questions specifically about the Speckle API (REST or GraphQL), data structures, and best practices for extracting data from Speckle.
  • Looker Studio Community Forum: Google hosts a community forum for Looker Studio users and developers. This is the place to ask questions about building Community Connectors, using Google Apps Script with Looker Studio, connecting to Google Sheets or BigQuery, and general Looker Studio features.
  • Google Apps Script Community: There are forums and resources specifically for Google Apps Script development (e.g., Stack Overflow with the google-apps-script tag). This is useful for questions about writing the scripts for data extraction or connector implementation.
  • Stack Overflow: A general programming Q&A site. You can find answers to specific coding problems related to making HTTP requests, parsing JSON, or working with Google Cloud services (like BigQuery) or Google Sheets APIs. Use relevant tags like speckle, looker-studio, google-apps-script, google-bigquery, rest, graphql, etc.
  • GitHub: Look for existing open-source Speckle connectors or data extraction scripts on GitHub. Studying existing code can be a great way to learn.
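
The example Apps Script below implements option 2: it fetches JSON from a REST endpoint with UrlFetchApp and writes the rows into a Google Sheet.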
/**
 * Fetches data from a specified REST API endpoint and writes it to a Google Sheet.
 *
 * --- Important Notes & Quotas ---
 * When using Google Apps Script to fetch data from external APIs and write to Google Sheets,
 * it's important to be aware of the quotas and limitations imposed by Google.
 *
 * Key Limitations:
 * - Execution Time: Scripts have a maximum execution time limit (e.g., 6 minutes for consumer accounts).
 * - UrlFetchApp Quotas: Daily limits on the number of calls using UrlFetchApp.fetch().
 * - Spreadsheet Operations: Limits on reads/writes per execution or per day.
 * - Total Trigger Runtime: Daily runtime limit across all triggers.
 *
 * Useful Documentation Links:
 * - Quotas for Google Services: This is the primary page listing all Apps Script quotas.
 * It's essential reading to understand the limits that apply to your account type.
 * https://developers.google.com/apps-script/guides/services/quotas
 *
 * - UrlFetchApp Reference: Learn more about the UrlFetchApp service and its methods.
 * https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app
 * --- End Important Notes & Quotas ---
 */
function fetchDataAndWriteToSheet() {

  // --- Configuration ---
  // Replace with the URL of the REST API endpoint you want to fetch data from.
  var apiUrl = 'YOUR_API_URL_HERE';

  // Replace with the ID of your Google Sheet. You can find this in the sheet's URL.
  var sheetId = 'YOUR_SHEET_ID_HERE';

  // Replace with the name of the sheet tab where you want to write the data.
  var sheetName = 'Sheet1';

  // Set to true if the first row of your sheet should be treated as headers
  // and not overwritten by API data.
  var hasHeaders = true;
  // --- End Configuration ---


  try {
    // 1. Fetch data from the API
    // Be mindful of UrlFetchApp quotas, especially for frequent or large requests.
    // See: https://developers.google.com/apps-script/guides/services/quotas
    var response = UrlFetchApp.fetch(apiUrl);
    var jsonString = response.getContentText();
    var data = JSON.parse(jsonString);

    // Log the fetched data for debugging (optional)
    Logger.log(data);

    // 2. Get the Google Sheet and the target sheet tab
    // openById(sheetId) uses the ID configured above and works for standalone
    // scripts; if the script is bound to the spreadsheet you can use
    // SpreadsheetApp.getActiveSpreadsheet() instead.
    var spreadsheet = SpreadsheetApp.openById(sheetId);
    var sheet = spreadsheet.getSheetByName(sheetName);

    if (!sheet) {
      Logger.log('Sheet with name "' + sheetName + '" not found.');
      return; // Exit if the sheet is not found
    }

    // 3. Prepare data for writing to the sheet
    // This part is highly dependent on the structure of the JSON data returned by your API.
    // You will likely need to adapt this section to extract the specific data
    // you want to put into your sheet rows and columns.

    // Example: Assuming the API returns an array of objects, and you want to write
    // the values of 'key1', 'key2', and 'key3' from each object into columns.
    var dataToWrite = [];
    if (Array.isArray(data)) { // Check if the top level is an array
      // Assuming each item in the array is an object representing a row
      data.forEach(function(item) {
        var row = [];
        // Adapt these keys to match the keys in your API's JSON response
        row.push(item.key1 || ''); // '' fallback for missing keys (note: || also blanks falsy values like 0)
        row.push(item.key2 || '');
        row.push(item.key3 || '');
        // Add more columns as needed...
        dataToWrite.push(row);
      });
    } else if (typeof data === 'object' && data !== null) { // Assuming the top level is a single object
       // If the API returns a single object, you might need to extract data differently.
       // Example: If the data is under a 'results' key and is an array
       if (Array.isArray(data.results)) {
         data.results.forEach(function(item) {
           var row = [];
           // Adapt these keys to match the keys in your API's JSON response
           row.push(item.key1 || '');
           row.push(item.key2 || '');
           row.push(item.key3 || '');
           // Add more columns as needed...
           dataToWrite.push(row);
         });
       } else {
         // Handle other object structures as needed
         Logger.log("API response is an object but not an array of results. Adapt the code to handle this structure.");
         // Example: If it's a single row object
         var row = [];
         row.push(data.key1 || '');
         row.push(data.key2 || '');
         row.push(data.key3 || '');
         dataToWrite.push(row);
       }
    } else {
      Logger.log("API response is not a recognized format (array or object). Adapt the code to handle this structure.");
      return; // Exit if data format is unexpected
    }


    // 4. Clear existing data in the sheet (optional, but often useful)
    // Determine the starting row for writing data. If hasHeaders is true, start from row 2.
    var startRow = hasHeaders ? 2 : 1;
    var lastRow = sheet.getLastRow();

    if (lastRow >= startRow) {
       // Clear content from the starting row down to the last row
       sheet.getRange(startRow, 1, lastRow - startRow + 1, sheet.getMaxColumns()).clearContent();
    }


    // 5. Write the new data to the sheet
    // Be mindful of potential spreadsheet operation quotas for very large datasets.
    // See: https://developers.google.com/apps-script/guides/services/quotas
    if (dataToWrite.length > 0) {
      var range = sheet.getRange(startRow, 1, dataToWrite.length, dataToWrite[0].length);
      range.setValues(dataToWrite);
      Logger.log('Data successfully written to sheet.');
    } else {
      Logger.log('No data to write to the sheet.');
    }

  } catch (e) {
    // Log any errors that occur
    // Error handling is crucial, especially when dealing with external APIs and network issues.
    Logger.log('Error fetching data or writing to sheet: ' + e.toString());
  }
}

/**
 * Helper function to demonstrate how to trigger the main function.
 * You can set this function to run on a time-driven trigger.
 *
 * Be aware of the total trigger runtime quota.
 * See: https://developers.google.com/apps-script/guides/services/quotas
 */
function setupTrigger() {
  // Delete existing triggers for this function to avoid duplicates
  var triggers = ScriptApp.getProjectTriggers();
  for (var i = 0; i < triggers.length; i++) {
    if (triggers[i].getHandlerFunction() === 'fetchDataAndWriteToSheet') {
      ScriptApp.deleteTrigger(triggers[i]);
    }
  }

  // Create a new time-driven trigger to run fetchDataAndWriteToSheet daily
  ScriptApp.newTrigger('fetchDataAndWriteToSheet')
      .timeBased()
      .everyDays(1) // Or everyHours(1), everyMinutes(5), etc.
      .create();

  Logger.log('Daily trigger created for fetchDataAndWriteToSheet.');
}
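
For option 3, the same extraction logic can feed BigQuery instead of a sheet. Below is a minimal sketch using the BigQuery advanced service (which must be enabled under Services in the Apps Script editor). The project, dataset, and table IDs are placeholders, the table is assumed to already exist with matching columns, and the item property names (id, speckle_type) are illustrative: adapt them to your actual Speckle response.

/**
 * Minimal sketch for option 3: stream extracted rows into an existing
 * BigQuery table via the BigQuery advanced service (enable it first).
 */
function loadRowsIntoBigQuery(items) {
  var projectId = 'YOUR_GCP_PROJECT_ID'; // placeholder
  var datasetId = 'speckle_data';        // placeholder
  var tableId = 'objects';               // placeholder

  // Map each extracted Speckle item to a streaming-insert row.
  var rows = items.map(function(item) {
    return {
      json: {
        objectId: item.id || '',
        speckleType: item.speckle_type || ''
      }
    };
  });

  var response = BigQuery.Tabledata.insertAll(
      { rows: rows }, projectId, datasetId, tableId);

  if (response.insertErrors && response.insertErrors.length) {
    Logger.log('Some rows failed to insert: ' + JSON.stringify(response.insertErrors));
  } else {
    Logger.log(rows.length + ' rows streamed into BigQuery.');
  }
}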

