BOSH Links: Why and How

Audience: anyone in the BOSH ecosystem, whether you work on something open-source or proprietary

"BOSH Links" is a feature which simplifies how data is shared between BOSH-deployed jobs that need to collaborate with one another (e.g. a web server and its backing database). Like many BOSH things, the whole "BOSH Links" thing can seem counter-intuitive at first, and it may not be clear why things are the way they are. This note hopes to show some of the powerful benefits of BOSH Links, and provide examples and explanations that make things more intuitive.

Why?
BASIC: A simple problem statement
BASIC: Putting the puzzle pieces together
ADVANCED: But what about...
ADVANCED: More features
ADVANCED: Links++
Action items for release authors

The "Why?" section tells you why Links were invented and why you would care, as a release author building releases for an operator to deploy, or as an operator juggling a bunch of BOSH deployments. The "BASIC" sections will familiarize you with the basic mechanics of Links, introducing some problems with manifests that need solving, showing how they can be solved with Links. The discussion here is fairly low-level and concrete, giving you most of the requisite know-how to work with links.

After reading the BASIC sections, take a break 😌 and tinker around with some releases and manifests yourself.

Sooner or later, you're probably going to have a bunch of "but what about..." questions about Links, common use cases which may not appear to play nicely with Links. Those will be addressed in the "ADVANCED: But what about..." section. The problems raised in the "BASIC" sections could be solved without Links, so it's important to see "ADVANCED: More features" to see how the same Links abstractions can be used to solve the real-world problems that can't easily be solved today in the BOSH ecosystem.

It's also really important to understand that the bigger problems that need to be solved when it comes to managing BOSH releases and manifests aren't solved by one or two features alone. Links need to be combined with other new features in the BOSH ecosystem to see the full benefits, which is discussed in the "ADVANCED: Links++" section.

Finally, if you're a release author, check out the final section to see how your work fits into bringing these benefits to the BOSH ecosystem.

1. Why?

As a standalone feature, links help simplify BOSH manifests, making them:

easier to read; and
simpler to build.

Bigger picture, these things would ideally lower the barrier to entry to using BOSH and consuming BOSH-releases, and enable lighterweight tooling to generate manifests, so that people's BOSH workflows aren't subject to "lock-in" (hard dependencies on tools like spiff, spruce, enaml, or Ops Manager). Furthermore, a real-world operator is managing multiple deployments (maybe Cloud Foundry, MySQL, Redis, RabbitMQ, etc.), not just one. Managing all the data that needs to be shared across them is very difficult and is a problem a lot of the aforementioned tools try to solve, all in their own different ways. BOSH Links are a crucial part of a BOSH-native solution to this problem.

Amplified by other "BOSH 2.0" features such as Global Networking, Cloud Config, Operations Files, and CredHub, Links allow manifests to be further simplified, removing most environment-specific details from manifests such as credentials, IPs, IaaS-specific cloud properties, and other common structural features (e.g. backing Cloud Foundry with an internal WebDAV blobstore vs. S3). This additionally makes manifests:

useful to share; and
reasonable to extend.

Ideally, these things further reduce friction within the ecosystem, and reduce error-prone duplication of effort, especially in cases where one wants to build a commercial distribution of some BOSH-deployable product on top of an open-source deployment configuration.

2. BASIC: A simple problem statement

Imagine a simple deployment with a web server and a database. The web server will need to know the address and port of the database, as well as username/password access credentials. Oftentimes, we don't want to have to:

specify the username and password twice in the manifest (once for the database to configure itself, and once for the web server to know how to talk to the database);
specify the username or password even once, something else could just generate some secure credentials, it just matters that web server and database have the same configuration;
specify the port explicitly, the database should default to binding to some port, and web server should have a way of discovering this port;
specify an IP for the database twice (again, once for the database VM itself, and once for the web server to know where the database is); and
specify the database IP even once, something else can pick an IP and just make sure both jobs know about this IP.

We'll ignore point 2 and 5 for now, but the rest can be addressed by BOSH Links alone.

Stepping back for a second, let's look at this from first principles and level-set the conversation. We want to run a database process and a web server process. Running these processes usually involes a start script and some configuration files. We package the source code and/or executable binaries together with templates for start scripts and configuration files inside our BOSH releases. A BOSH deployment manifest includes environment-specific values for properties and network addresses; these values are used to render the templates and those rendered templates turn into the actual start scripts and configuration files used at runtime.

Rather than having to specify this data multiple times in the manifest (once for consumers like web servers and once for providers like the database), Links allow a form of dependency injection, where the parameters for constructing/configuring the database only need to be specified (at most) once, and the web server can simply declare its dependency on the database configuration. BOSH Links then injects that data in the consuming job's (web server's) template rendering context. Another analogy is service discovery, in this case it's data discovery; the consumer job simply declares it wants to find "database" data, and ta-da, there it is in its template rendering context, an operator didn't need to manually feed it that data.

3. BASIC: Putting the puzzle pieces together

Let's look at how Links play into:

release job specs;
release job templates; and
a deployment manifest

to solve the problem.

The web_server's job spec declares its dependency:

~/workspace/releases/app/jobs/web_server/spec

---
name: web_server

templates:
  web_server_ctl.sh.erb: bin/web_server_ctl
  config.json.erb: config.json

packages:
  - web_server_binary

# DEPENDENCIES DECLARED IN THIS SECTION
consumes:
- {name: database, type: database}

properties:
  greeting:
    description: Greeting to display to the user when they reach the landing page
    default: Hello, user!
  port:
    description: Port that the web server binds to
    default: 8080

and uses its "database" dependency in one of its templates:

~/workspace/releases/app/jobs/web_server/templates/config.json.erb

{
  "database_address": "<%= link("database").instances[0].address %>",
  "database_port": <%= link("database").p("port") %>,
  "database_username": "<%= link("database").p("username") %>",
  "database_password": "<%= link("database").p("password") %>",
  "greeting": "<%= p("greeting") %>",
  "port": <%= p("port") %>
}

There are a couple more pieces to the puzzle, namely the database job spec and the relevant part of the BOSH manifest, but let me bring up a few questions which we'll address after:

What does the whole {name: database, type: database} thing mean? Are these magic strings?
What is link("database"), and what is .instances[0] about, and what is that .address accessor?
Speaking of which, what is the .p(...) method about on the link("database") object?

Here's the database job's spec:

~/workspace/releases/app/jobs/database/spec

---
name: database

templates:
  database_ctl.sh.erb: bin/database_ctl
  config.json.erb: config.json

packages:
  - database_binary

# EXPORTED DATA DECLARED IN THIS SECTION
provides:
- {name: database, type: database, properties: [port, username, password]}

properties:
  port:
    description: Port that the database binds to
    default: 5432
  username:
    description: Username used to access the database
  password:
    description: Password used to access the database
  max_connections:
    description: Maximum number of database connections
    default: 500

And let's look at an example manifest:

~/workspace/deployments/galileo_environment/my_app.yml

---
name: my_app

releases:
- name: app
  version: latest

instance_groups:
- name: web_server
  jobs:
  - release: app
    name: web_server
    properties:
      greeting: "Namaste!"
  # <uninteresting>
  instances: 3
  azs: [z1, z2, z3]
  networks: [{name: default}]
  vm_type: default
  stemcell: default
  # </uninteresting>
  
- name: database
  networks: [{name: default, static_ips: [10.0.16.4]}]
  jobs:
  - release: app
    name: database
    properties:
      username: admin
      password: passw0rd
  # <uninteresting>
  instances: 1
  azs: [z1]
  vm_type: default
  stemcell: default
  # </uninteresting>

# <uninteresting>
update:
  canaries: 1
  canary_watch_time: 10000-600000
  update_watch_time: 10000-600000
  max_in_flight: 1
  serial: true

stemcells:
- alias: default
  os: ubuntu-trusty
  version: latest
# </uninteresting>

If we look at these examples in the context of the 5 items enumerated in the simple problem statement, note that the web_server instance group is not provided any configuration in the manifest regarding the address or credentials of the database. Those data are only specified once, and only on the database instance group (namely 10.0.16.4, admin, and passw0rd). Also note that the database port is not specified in the manifest at all!

It's unfortunate that I have to specifify the IP at all, since now this manifest may not work in your environment if you happen to have a different IP range. And it's unfortunate that the username and password are in this manifest, both because it creates busy-work for us to generate them and put them in the manifest, and it's really bad from a security perspective to have these plainly visible in the manifest. We'll address these points later, but let's first answer some of the questions raised earlier.

About those magic strings...

We see the string database in a ton of places. The web_server job spec consumes a link with name: database and type: database. One of the web_server's templates references link("database"). The database job itself has name: database (at the top level), and it also provides a link with name: database and type: database. Finally, the manifest has an instance group with name: database, and that instance group has a job with name: database. Let's adjust the example files so that only the strings that truly need to match do so, we'll give things contrived names to clarify their purpose (and we'll remove the <uninteresting> bits):

~/workspace/releases/app/jobs/web_server_process/spec

---
name: web_server_process

templates:
  web_server_process_ctl.sh.erb: bin/web_server_process_ctl
  config.json.erb: config.json

packages:
  - web_server_binary

# DEPENDENCIES DECLARED IN THIS SECTION
consumes:
- {name: local_database_data_object_reference, type: database_data_class}

properties:
  greeting:
    description: Greeting to display to the user when they reach the landing page
    default: Hello, user!
  port:
    description: Port that the web server binds to
    default: 8080

~/workspace/releases/app/jobs/web_server_process/templates/config.json.erb

{
  "database": {
    "address": "<%= link("local_database_data_object_reference").instances[0].address %>",
    "port": <%= link("local_database_data_object_reference").p("port") %>,
    "username": "<%= link("local_database_data_object_reference").p("username") %>",
    "password": "<%= link("local_database_data_object_reference").p("password") %>"
  },
  "greeting": "<%= p("greeting") %>",
  "port": <%= p("port") %>
}

~/workspace/releases/app/jobs/database_process/spec

---
name: database_process

templates:
  database_process_ctl.sh.erb: bin/database_process_ctl
  config.json.erb: config.json

packages:
  - database_binary

# EXPORTED DATA DECLARED IN THIS SECTION
provides:
- {name: relevant_later, type: database_data_class, properties: [port, username, password]}

properties:
  port:
    description: Port that the database binds to
    default: 5432
  username:
    description: Username used to access the database
  password:
    description: Password used to access the database
  max_connections:
    description: Maximum number of database connections
    default: 500

~/workspace/deployments/galileo_environment/my_app.yml

---
# <uninteresting></uninteresting> bits removed for clarity!

name: my_app

releases:
- name: app
  version: latest

instance_groups:
- name: web_server_instances
  jobs:
  - release: app
    name: web_server_process
    properties:
      greeting: "Namaste!"

- name: database_instance
  networks: [{name: default, static_ips: [10.0.16.4]}]
  jobs:
  - release: app
    name: database_process
    properties:
      username: admin
      password: passw0rd

So now we have:

The deployment manifest has an instance group called database_instance, which represents the database instance that will be deployed to the cloud.
That database instance runs one job, the database_process, which refers to the BOSH job database_process in the app release.
The database_process job exports or provides some data, and the "type" of data it exports is called database_data_class. The name of the exported data is relevant_later because it's irrelevant for now.
The web_server_process consumes data of type database_data_class. This is essentially where it declares the dependency it needs injected. I used the class suffix to hint at that. Indeed, the fact this is under type: database_data_class, the type part is sort of a hint to the fact that it acts like a class name that you're dependency-injecting. On the other hand, this type key is very unfortunately named, and may be better to think of as a "tag" or "label" that simply used for matching, there's no underlying "type system" that makes sure your data "compiles", there are no "standard library" types like "URL" or "RSAKeyPair" that you might imagine would be common link types, they're just arbitrary, meaningless strings.
The web_server_process names the link it consumes local_database_data_object_reference, and then at template-render time, this reference is used to actually access the data its depending on, which we see in the config.json.erb as link("local_database_data_object_reference").

About that `link("...")` object, and its `.instances[0]`, and the `.address` accessor

Notice that we only had to specify the IP on the database_instance (as 10.0.16.4), nowhere in the manifest under the web_server_instances did we include this IP. When a job consumes a link, it gets a bunch of data for free about the instances (in this case database_instance) that are running the job (in this case database_process) which provide the link it consumes. NOTE: the web server needs to know not only details specific to how the database process is configured (e.g. username and password), but also some infrastructure-y/network-y data about the VM(s) this process is running on, (e.g. address).

link("local_database_data_object_reference").instances is an array of objects representing instances/VMs on which the job-providing-the-database_data_class-link is running. We only have one such instance (see instances: 1 in the manifest), hence link("local_database_data_object_reference").instances[0]. This object has the data about that VM, including its address, accessed via the .address accessor. If our web server were designed to talk to some HA database that had multiple instances, each with their own address, you might see something like this instead:

{
  "database": {
    "addresses": <%= link("local_database_data_object_reference").instances.map(&:address) %>, 
    ...
  }
}

There's all sorts of other data you get for free about the instances, see the "Available instance object methods" here.

The `.p(...)` method on the `link(...)` object

There's a bunch of job-configuration-specific data which you don't get "for free". However, if a job providing a link excplicitly exports certain properties, then consuming jobs can access the values. Note that while the database_process job has a property called max_connections declared in its job spec, that isn't named amongst the properties provided in the relevant_later link. That's because it's an internal detail that something talking to this database doesn't really need to know about. It does export the port, username, and password properties.

From the database_process perspective, the port value comes from the default 5432, whereas the username and password are specified in the manifest where the database_process job itself is configured. From the web_server_process perspective, these properties are all accessed in the job templates, via the p method on the link("local_database_data_object_reference") object. This behaves similar to the top-level p method in the template rendering context. p("greeting") is used to extract the value of the greeting property defined directly in the web_server_process job spec, and ...p("port") is used to grab the port data off of the local_database_data_object_reference link.

Aside: It's worth highlighting that this can definitely seem a bit awkward at first. A job specifies that it provides a certain link, and then you have a job that can consume that link. When the job consumes the link(...) object, it gets access to some of the property configuration data of the providing job via link(...).p(...) which makes sense -- the job declares that it's providing a link, and thus I can access properties of that job. But I can also get data about the VMs (or more generally, instances) running that job at deploy time via the link(...).instances array.

The instance group doesn't declare explicitly that it's providing any link, but the instance group happens to run a job, and that job provides the link, and so I get access to necessary information about the instances as well. The abstraction may seem a little funny, but you definitely need things like the IP and port of a database you depend on -- the port is something the database job or process care about, the IP is something that has more to do with the VM or instance on which that database process is running. So link(...) provides both of these kinds of data.

Break Time

4. ADVANCED: But what about...

Backwards compatibility with existing job templates, job specs, and deployment manifests

Before links, the web_server_process job spec would have had a bigger properties section that might have looked like this:

properties:
  greeting:
    description: Greeting to display to the user when they reach the landing page
    default: Hello, user!
  port:
    description: Port that the web server binds to
    default: 8080
  database.address:
    description: Address of the database used to back this web server
  database.port:
    description: Port of the database used to back this web server
  database.username:
    description: Username used to access the backing database
  database.password:
    description: Password used to access the backing database

and the template might have looked like this:

~/workspace/releases/app/jobs/web_server_process/templates/config.json.erb

{
  "database": {
    "address": "<%= p("database.address") %>",
    "port": <%= p("database.port") %>,
    "username": "<%= p("database.username") %>",
    "password": "<%= p("database.password") %>"
  },
  "greeting": "<%= p("greeting") %>",
  "port": <%= p("port") %>
}

The unfortunate thing in this world is the manifest would have had to have a lot more duplicated junk, you'd typically see things like this:

~/workspace/deployments/galileo_environment/my_app_without_links.yml

---
name: my_app_without_links

releases:
- name: app
  version: latest

instance_groups:
- name: web_server_instances
  jobs:
  - release: app
    name: web_server_process

- name: database_instance
  networks: [{name: default, static_ips: [10.0.16.4]}]
  jobs:
  - release: app
    name: database_process

properties:
  # For web server
  greeting: "Namaste!"
  database:
    address: 10.0.16.4
    username: admin
    password: passw0rd
    port: 5432
  
  # For database
  username: admin
  password: passw0rd
  port: 5432

I have to specify the IP, username, and password twice. The database port defaults to 5432 but without links, the web_server_process job needs to be configured with that data explicitly. For sanity, to avoid coupling the database.port property provided to the web_server_process in the deployment manifest with the port property's default value in the database_process job spec, it's best to also explicitly set the database_process's port value in the manifest to match the value given to the web_server_process in the manifest to make sure they truly match, even if you happen to provide the same value that the database_process job spec happens to default to currently. So there's actually triplication going on there, not mere duplication.

Well, links are nice to avoid these issues, but how do I support both ways of configuring my jobs for compatibility reasons? In the short term, you might want to express logic like, "if the old-style properties are provided, honour those; if not, but something else in my deployment is providing the data via a link I can consume, then use that data". Check out how the NATS smoke-tests job determines the port of the nats server job here:

releases/nats-release/jobs/smoke-tests/templates/config.json.erb

<%
  ...
  
  nats_port = nil
  if_p("nats.port") { |port| nats_port = port }
  unless nats_port
    nats_port = link("nats").p("nats.port")
  end
%>

{
  ...
  "Port": <%= nats_port %>
}

The key is the usage of the if_p helper function.

The logic above "prefers" the "old-style" properties, and only if they're not present, looks for the link. For compatbility with older BOSH Directors, you want this logic because they won't know anything about the link method, and you don't want the Director to reach that line of code at template-render time. Operators are recommended to upgrade their Directors to a version that supports Links (which have been GA for several months now), at which point one can invert the logic to prefer links over "old-style" properties.

In the medium-term future, start to use the if_link helper and optional: true when specifying Links consumed by your release jobs, so that your ERB templates look more like this:

<%
  ...
  
  nats_port = nil
  if_link("nats") { |nats_link| nats_port = nats_link.p("nats.port") }
  unless nats_port
    nats_port = p("nats.port")
  end
%>

{
  ...
  "Port": <%= nats_port %>
}

In the long-term, one can drop support for "old-style" properties altogether. Worst case, the link values can be specified explicitly in the manifest (just as properties are today, but in a different location).

What happens if multiple things provide the same type of link?

In the basic example, the name of the link being provided wasn't relevant, hence the name relevant_later. BOSH knows to connect the data coming from the database_process to the web_server_process implicitly based on the type of the link, namely database_data_class. But if your manifest happened to have two jobs, in two instances groups (or the same for that matter) which both happened to provide links of the same database_data_class type, how would BOSH disambiguate? By default, it won't, and you'll get a nice error message when you try to bosh deploy, for example:

- Multiple instance groups provide links of type 'database_data_class'. Cannot decide which one to use for instance group 'web_server_instances'.
   my_app.database_instance.database_process.relevant_later
   my_app.another_database_instance.database_process.relevant_later

The types of the consumed and provided links are specified in the BOSH releases, but if you're an operator who only has control of the BOSH manifest, you need some way in the manifest to disambiguate.

Enter as and from. Think of this as being analogous to import renaming in code, e.g.

foo.py

import numpy as np

...

or a Golang analogy:

main.go

package main

import (
  vaultapi "github.com/hashicorp/vault/api"
  consulapi "github.com/hashicorp/consul/api"
)

Here's a contrived example where there are two copies of our web app, one in Hindi and one in French. Each one is backed by a different database (because the database stores the number of views, and we want to see which app is more popular so we want view counts stored in separate databases).

~/workspace/deployments/galileo_environment/my_app_a_b_testing.yml

---
name: my_app_a_b_testing

releases:
- name: app
  version: latest

instance_groups:
- name: hindi_web_server_instances
  jobs:
  - release: app
    name: web_server_process
    properties:
      greeting: "Namaste!"
    consumes:
      local_database_data_object_reference: {from: hindi_database}

- name: french_web_server_instances
  jobs:
  - release: app
    name: web_server_process
    properties:
      greeting: "Bonjour!"
    consumes:
      local_database_data_object_reference: {from: french_database}

- name: hindi_database_instance
  networks: [{name: default, static_ips: [10.0.16.4]}]
  jobs:
  - release: app
    name: database_process
    properties:
      username: admin
      password: surya44chand
    provides:
      relevant_later: {as: hindi_database}

- name: french_database_instance
  networks: [{name: default, static_ips: [10.0.16.5]}]
  jobs:
  - release: app
    name: database_process
    properties:
      username: admin
      password: soleil44lune
    provides:
      relevant_later: {as: french_database}

There's two distinct copies of the database_process job, one in the french_database_instance and one in the hindi_database_instance. They both provide a link of type database_data_class, so BOSH lets you disambiguate the two sets of link data by saying that, for the purpose of this manifest, they will each be known as something else, something unique. One bunch of database_data_class link data shall be known as hindi_database, and the other as french_database. There happen to be two jobs wanting to consume data of type database_data_class, but they want to consume different instances of that data. Locally, they want to reference those consumed links as local_database_data_object_reference, but the French web server is going to get that from french_database and the Hindi web server from the hindi_database.

What if the link types don't match?

You might not have your database and web server in the same release. You might consume the database from some release made by a different team or community. What are the chances they happened to provide a link with type database_data_class? There's no central authority ensuring the consistency of that magic string across release authors. as and from can help with this too. You might want to use a different release which provides a database, and its job provides a link with a much less contrived type than database_data_class, maybe its type is database (like it was before we made the basic example contrived) and its name is postgres (as opposed to relevant_later).

You could do this:

~/workspace/deployments/galileo_environment/my_app_with_other_database.yml

---
name: my_app_with_other_database

releases:
- name: app
  version: latest
- name: other_database # NOTE: we're going to use a database job from a different release
  version: latest

instance_groups:
- name: web_server_instances
  jobs:
  - release: app
    name: web_server_process
    properties:
      greeting: "Namaste!"
    consumes:
      local_database_data_object_reference: {from: explicit_db_link_name}

- name: database_instance
  networks: [{name: default, static_ips: [10.0.16.4]}]
  jobs:
  - release: other_database
    name: database
    properties:
      username: admin
      password: surya44chand
    provides:
      postgres: {as: explicit_db_link_name}

Gotcha: It's more than the link types which need to match, the properties provided by the link, and then consumed via link(...).p(...) need to match. In general, that's not going to be the case. The link ecosystem is very young, and there are no established conventions. Compare for instance the properties provided by the Cloud Foundry community's postgres-boshrelease here to Cloud Foundry's cf-mysql-release here. The web server can't truly swap out alternative implementations of the database_data_class link implementation. Maybe someday!

What about external data from something not managed by BOSH, like an RDS database?

BOSH has no way to know the address, port, or credentials of your RDS database, so you'll have to manually provide this link data. If your release has done the work to make itself backwards compatible with old manifests, you might be tempted to provide this data in the traditional properties sections at the top level of the manifest, or under a particular job listed within an instance group. A better thing to do would be to put all the data in the links whenever possible, so that you can eventually drop support for non-link data. That way, you only have one source of data to worry about and can eventually drop the if_p branching for backwards compatibility in your release templates. You can explicitly spell out what data a job consumes for one of its links directly in the manifest. It's just as desribed on the BOSH docs page on Manual Linking:

~/workspace/deployments/galileo_environment/my_app_with_rds.yml

---
name: my_app_with_rds

releases:
- name: app
  version: latest

instance_groups:
- name: web_server_instances
  jobs:
  - release: app
    name: web_server_process
    properties:
      greeting: "Namaste!"
    consumes:
      local_database_data_object_reference:
        instances:
        - address: teswfbquts.cabsfabuo7yr.us-east-1.rds.amazonaws.com
        properties:
          port: 3306
          username: some-username
          password: some-password
          name: my-app

5. ADVANCED: More features

Thank you webmaster! But our `database_process` is in another deployment!

What if you want your database to be managed by your DBA, and someone else manages the web server? You might have your database_process and your web_server_process in different deployments, managed by different "BOSH teams" leveraging the new BOSH RBAC features available via the BOSH+UAA integration. You can still link data across deployments:

~/workspace/deployments/galileo_environment/standalone_database.yml

---
name: standalone_database # <== NOTE THIS DEPLOYMENT NAME

releases:
- name: app
  version: latest

instance_groups:
- name: database_instance
  networks: [{name: default, static_ips: [10.0.16.4]}]
  jobs:
  - release: app
    name: database
    properties:
      username: admin
      password: surya44chand
    provides:
      relevant_later:
        as: explicit_db_link_name
        shared: true

~/workspace/deployments/galileo_environment/standalone_webserver.yml

---
name: standalone_webserver

releases:
- name: app
  version: latest

instance_groups:
- name: web_server_instances
  jobs:
  - release: app
    name: web_server_process
    properties:
      greeting: "Namaste!"
    consumes:
      local_database_data_object_reference:
        from: explicit_db_link_name
        deployment: standalone_database # <== DEPLOYMENT NAME REFERENCED HERE

A few things:

To expose data to other deployments, the link being provided needs to have shared: true set
"Implicit linking" (where things are matched on the type defined in the job spec) doesn't work across deployments. Therefore, you must have as set on the link being provided and from on the link being consumed, and the values of these things needs to match across deployment manifests, as you can see in the above two manifest examples. This is the magic string that glues the data together across deployments.
The consuming link must also specify the name of the deployment it is consuming from, in case multiple deployments are providing a link with the same as value.

Self-linking for clustering

Clustered jobs like etcd and Consul often need to know the addresses of their peers. This can be accomplished when a job consumes a link which it itself provides. This can be a bit awkward, but it works. See how the nats job in the nats-release specifies it in the job spec and then leverages it in one of its templates.

6. ADVANCED: Links++

We started talking about all the things we didn't want to specify in the manifest. Most go away automatically with Links, but relying on some other new BOSH features, we can also get rid of having to choose that static IP, as well as the username/password for the database.

The manifest examples above have already removed all the network and resource pool configuration, so you know it's using Cloud Config and hence Global Networking. This means BOSH can actually pick static IPs for you. That is to say, we never really needed to set the static_ips in the above manifests in the first place!

Here, Links + Global Networking are working together. Since web_server_process depended on IPs of database_process, you'd (A) need to explicitly configure web_server_process in the manifest by providing it the IPs of database_process. And to make sure database_process actually has those IPs you told web_server_process, you'd (B) need to explicitly set static IPs for database_process. But since web_server_process's job templates can get the information via link(...).instances.map(&:address), then (A) goes away. And now you have reason to be explicit about what IPs database_process gets, and BOSH is able to pick the IPs for you, so (B) goes away. One of these features alone isn't enough to get the IPs out of the manifest, but together they are.

The username and password removal requires a bit more explanation, and there's a bit more work required in BOSH and CredHub to make things really seamless. There's actually a lot more junk in the manifest that isn't directly or indirectly related to links that would be nice to remove (e.g. Do I really need to specify azs or can it automatically stripe my job across all available ones? Could there be a default network specified in the Cloud Config so I don't need to keep specifying it in the instance groups?). A manifest might eventually look like this:

~/workspace/deployments/galileo_environment/my_app_future.yml

---
# DON'T NEED TO SPECIFY NAME, IT GETS SPECIFIED ON THE COMMAND LINE WHEN I 'bosh deploy my_app_future ...'

releases:
- name: app
  version: latest

instance_groups:
- name: web_server_instances
  jobs:
  - release: app
    name: web_server_process
    properties:
      greeting: "Namaste!"
  instances: 3
  vm:
    memory: 2GB
    cpu: 4
    ephemeral_disk: 20GB

- name: database_instance
  jobs:
  - release: app
    name: database_process
    properties:
      username: ((db_username)) # CREDHUB WILL GENERATE A USERNAME AND STORE IT INTERNALLY 
                                # WITH A NAMESPACED REFERENCE LIKE /my_bosh/my_app_future/db_username
      password: ((db_password)) # SIMILAR TO ABOVE
  instances: 1
  azs: [z1]
  vm:
    memory: 2GB
    cpu: 16
    ephemeral_disk: 20GB
  persistent_disk: 500GB
  
update:
  canaries: 1
  canary_watch_time: 10000-600000
  update_watch_time: 10000-600000
  max_in_flight: 1
  serial: true

stemcells:
- alias: whatever
  default: true
  os: ubuntu-trusty
  version: latest

7. Action items for release authors

There are nearly enough features in BOSH today to manage a foundation consisting of multiple deployments such as Cloud Foundry, MySQL, RabbitMQ, Redis, etc., with:

no IPs specified in any manifest;
no credentials specified in any manifest;
all internal "uninteresting" credentials automatically generated by CredHub (not by the operator using certrap, openssl, etc.);
no error-prone repetition of the same values in multiple places;

Furthermore, the existing BOSH features allow the same set of manifests used for one foundation to be used in a different foundation without the headache of making a ton of environment-specific adjustments. And going even further, existing BOSH features support a level of extensibility so that a vendor-specific distribution of Cloud Foundry, etc. can make heavy reuse of open source distributions such as cf-deployment. The next major milestones require release authors to build the little bits of glue that tie everything together. For 99% of releases out there, linkifying your individual release is cheap and easy, though it's most impactful when it's done across the board. If one release linkifies but most others don't, the operator doesn't see any big wins, e.g. instead of having to hand-maintain 100 IPs, they get to hand-maintain 99.

Be fruitful and linkify!

jpluscplusm/bosh-links-why-and-how.md