@jorendorff
Created February 20, 2018 21:42
diff --git a/js/public/StructuredClone.h b/js/public/StructuredClone.h
--- a/js/public/StructuredClone.h
+++ b/js/public/StructuredClone.h
@@ -15,26 +15,126 @@
#include "jstypes.h"
#include "js/RootingAPI.h"
#include "js/TypeDecls.h"
#include "js/Value.h"
#include "js/Vector.h"
+// API for the HTML5 internal structured cloning algorithm.
+//
+// The algorithm is specified in terms of examining one JS value and
+// constructing another. In practice, we often need the clone to be created in
+// another thread, process, or compartment. So we split the algorithm into two
+// parts:
+//
+// * The "write" phase examines the source value and writes a binary
+// description of the clone we want to create.
+//
+// * The "read" phase takes this description and actually creates the clone.
+//
+// We can transmit the bytes from one process to another, etc., so the second
+// phase can run whenever and wherever we need the clone.
+//
+// For two ways the reality is a bit more complicated than the theory, see
+// JS_STRUCTURED_CLONE_VERSION and JS::StructuredCloneScope below.
+
struct JSStructuredCloneReader;
struct JSStructuredCloneWriter;
-// API for the HTML5 internal structured cloning algorithm.
+// The structured-clone serialization format version number.
+//
+// When serialized data is stored as bytes, e.g. in your Firefox profile, later
+// versions of the engine may have to read it. When you upgrade Firefox, we
+// don't crawl through your whole profile converting all saved data from the
+// previous version of the serialization format to the latest version. So it is
+// normal to have data in old formats stored in your profile.
+//
+// The JS engine can *write* data only in the current format.
+//
+// It can *read* data written by earlier versions.
+//
+//
+// ## When to bump this version number
+//
+// When making a change so drastic that the JS engine needs to know whether
+// it's reading old or new serialized data in order to handle both correctly,
+// increment this version number. Make sure the engine can still read all
+// old data written with previous versions.
+//
+// If StructuredClone.cpp doesn't contain code that distinguishes between
+// version 8 and version 9, there should not be a version 9.
+//
+// Do not increment for changes that only affect SameProcess encoding.
+//
+// Increment only for changes that would otherwise break old serialized data.
+// Do not increment for new data types. (Rationale: Modulo bugs, older versions
+// of the JS engine can already correctly throw errors when they encounter new,
+// unrecognized features. A version number bump does not actually help them.)
+//
+#define JS_STRUCTURED_CLONE_VERSION 8
namespace JS {
+// Indicates what can be done with serialized data between writing and reading.
+//
+// Writing plain JSON data produces an array of bytes that can be copied and
+// read in another process or whatever. The serialized data is Plain Old Data.
+// However, HTML also supports `Transferable` objects, which, when cloned, can
+// be moved from the source object into the clone, like when you take a
+// photograph of someone and it steals their soul.
+// See <https://developer.mozilla.org/en-US/docs/Web/API/Transferable>.
+// We support cloning and transferring many types of object.
+//
+// For example, when we transfer an ArrayBuffer across threads, we "detach" the
+// ArrayBuffer, embed the raw buffer pointer in the serialized data,
+// and later install it in a new ArrayBuffer on the destination thread.
+//
+// Ownership of that buffer memory is transferred from the original ArrayBuffer
+// to the serialized data and then to the clone.
+//
+// This only makes sense within a single process. When we transfer an
+// ArrayBuffer to another process, the contents of the buffer must be copied
+// into the serialized data.
+//
+// ArrayBuffers are actually a lucky case; some objects can't reasonably be
+// transferred by value into serialized data -- it's pointers or nothing.
+//
+// So there is a tradeoff between "scope" -- how far away the serialized data
+// may be sent and read -- and efficiency or features.
enum class StructuredCloneScope : uint32_t {
+ // The most restrictive scope, with greatest efficiency and features.
+ //
+ // When writing, this means: The caller promises that the serialized data
+ // will **not** be shipped off to a different thread/process or stored in a
+ // database. It's OK to produce serialized data that contains pointers.
+ // In Rust terms, the serialized data will be treated as `!Send`.
+ //
+ // When reading, this means: The caller promises that the serialized data
+ // was written in the current thread and process. (???)
SameProcessSameThread,
+
+ // When writing, this means: The caller promises that the serialized data
+ // will **not** be shipped off to a different process or stored in a
+ // database. However, it may be shipped to another thread. It's OK to
+ // produce serialized data that contains pointers to data that is safe to
+ // send across threads, such as array buffers.
+ //
+ // When reading, this means: The caller promises that the serialized
+ // data was written in the current process. (???)
SameProcessDifferentThread,
+
+ // The broadest scope.
+ //
+ // When writing, this means: Produce serialized data that can be sent to
+ // other processes, bitwise copied, or even stored as bytes in a database
+ // and read by later versions of Firefox years from now. Transferable
+ // objects are limited to ArrayBuffers, whose contents are copied into
+ // the serialized data (rather than just writing a pointer).
DifferentProcess
};
enum TransferableOwnership {
/** Transferable data has not been filled in yet */
SCTAG_TMO_UNFILLED = 0,
/** Structured clone buffer does not yet own the data */
@@ -163,22 +263,16 @@ typedef bool (*TransferStructuredCloneOp
/**
* Called when freeing an unknown transferable object. Note that it
* should never trigger a garbage collection (and will assert in a
* debug build if it does.)
*/
typedef void (*FreeTransferStructuredCloneOp)(uint32_t tag, JS::TransferableOwnership ownership,
void* content, uint64_t extraData, void* closure);
-// The maximum supported structured-clone serialization format version.
-// Increment this when anything at all changes in the serialization format.
-// (Note that this does not need to be bumped for Transferable-only changes,
-// since they are never saved to persistent storage.)
-#define JS_STRUCTURED_CLONE_VERSION 8
-
struct JSStructuredCloneCallbacks {
ReadStructuredCloneOp read;
WriteStructuredCloneOp write;
StructuredCloneErrorOp reportError;
ReadTransferStructuredCloneOp readTransfer;
TransferStructuredCloneOp writeTransfer;
FreeTransferStructuredCloneOp freeTransfer;
};
diff --git a/js/src/vm/StructuredClone.cpp b/js/src/vm/StructuredClone.cpp
--- a/js/src/vm/StructuredClone.cpp
+++ b/js/src/vm/StructuredClone.cpp
@@ -427,17 +427,17 @@ struct JSStructuredCloneReader {
// format (eg a Transferred ArrayBuffer can be stored as a pointer for
// SameProcessSameThread but must have its contents in the clone buffer for
// DifferentProcess.)
JS::StructuredCloneScope storedScope;
// Stack of objects with properties remaining to be read.
AutoValueVector objs;
- // Stack of all objects read during this deserialization
+ // Collection of all objects read during this deserialization
AutoValueVector allObjs;
// The user defined callbacks that will be used for cloning.
const JSStructuredCloneCallbacks* callbacks;
// Any value passed to JS_ReadStructuredClone.
void* closure;
@hotsphink

hotsphink commented Feb 21, 2018

  • *transferring

  • transfer an ArrayBuffer across threads -- doesn't need to be across threads. (Parenthetical (possibly across threads)?)

  • I guess we'll continue to use the "same process" terminology. I half want to say "same address space", but I guess it's a useless distinction.

  • I was thinking "scope" in terms of "scope of validity" -- X is valid within the same thread / same address space / anywhere.

  • so, for scope: the scope is expected to be tracked external to the data. So clone data is really a tuple of <version, scope, data>, where the interpretation of the data is partially determined by the scope. (And the version is... I dunno, the version is kinda useless.) So read is really one of read<SameThread>, read<SameProcess>, read<DifferentProcess>, and in theory you have 9 possible combinations but in practice you can treat it as an ordered set of restrictiveness. So it's read<AllowedScope>, write<StoredScope> or something. AllowedScope is used for an access rights check, and StoredScope determines how the data should be stored (on the write side) and interpreted (on the read side). Your StructuredCloneScope comments, I believe, pertain to StoredScope.

  • ArrayBuffers are the standard Transferable, though I wonder if MessagePort might be a more illustrative example for some things.

  • My JS_STRUCTURED_CLONE_VERSION comment is embarrassing after having read yours. I'm kinda -- no, just outright -- lying.
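
The "ordered set of restrictiveness" idea from the scope bullet above can be sketched in a few lines of C++. This is only an illustration, not the code in StructuredClone.cpp: the `CloneData` struct and the `ScopeAllowsRead` and `CanRead` helpers are hypothetical names, and the rule that data is readable only when its stored scope is at least as broad as the scope the reader allows is an assumption drawn from the discussion. The enum is copied from the patch, where declaration order runs from the most restrictive scope to the broadest.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Copied from the patch: declared from most restrictive to broadest.
enum class StructuredCloneScope : uint32_t {
    SameProcessSameThread,
    SameProcessDifferentThread,
    DifferentProcess
};

// Hypothetical: clone data viewed as a <version, scope, data> tuple, with the
// scope tracked alongside the bytes rather than inside them.
struct CloneData {
    uint32_t version;
    StructuredCloneScope storedScope;
    std::vector<uint8_t> bytes;
};

// Hypothetical access check (an assumption, not the engine's actual logic).
// allowedScope says how far the data may have travelled before this read.
// Data written with a narrower scope may contain raw pointers, so it is
// rejected whenever the reader allows for a broader journey than that.
bool ScopeAllowsRead(StructuredCloneScope allowedScope,
                     StructuredCloneScope storedScope) {
    return static_cast<uint32_t>(storedScope) >=
           static_cast<uint32_t>(allowedScope);
}

// Convenience wrapper over the tuple.
bool CanRead(const CloneData& data, StructuredCloneScope allowedScope) {
    return ScopeAllowsRead(allowedScope, data.storedScope);
}
```

Under this assumption, a reader that allows `SameProcessSameThread` accepts data written with any scope, while a reader that allows `DifferentProcess` accepts only data written with `DifferentProcess`. That collapses the nine possible combinations into a single ordered comparison.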
