func (a *UpgradeGatewayAPIV1) runCluster() http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		clientNodes, _ := connector.AgentClients(r.Context())
		// filter out cluster delegate and non-delegate nodes
		var clusterDelegateNode *catalog.Agent
		var nonDelegateNodes []*catalog.Agent
		for _, node := range clientNodes {

#!/bin/bash
#/ Usage: ghe-storage-verify-mysql-migration <backup-path> <destination-path>
#/ Verifies that two MySQL datadir trees are identical by comparing md5sums
#/ of every file beneath each path.
#/
#/ Example (after ghe-storage-migrate-mysql):
#/ ghe-storage-verify-mysql-migration /data/user/mysql-backup/github_enterprise /data/multi-disk/db/mysql/github_enterprise
set -e
- https://github.com/github/ghes/issues/16434
- bundle: https://esbtools-staff.githubapp.com/bundles/191214 (no increase on resource)
- 69,336 active series with a total of 211.8 million datapoints across the ~48-hour window
⚠️ observability.metrics.prometheus-endpoint-enabled true
| Factor | Customer |
|---|---|
| Active series | 69,336 |
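As a rough sanity check on the figures above, the series and datapoint counts imply an average sampling rate. The sketch below assumes every series was active and sampled evenly for the whole window and treats the "~48-hour" window as exactly 48 hours. Both are simplifications, and the function names are hypothetical.

```go
package main

import "fmt"

// samplesPerSeries returns the average number of datapoints per series.
func samplesPerSeries(datapoints, series float64) float64 {
	return datapoints / series
}

// impliedIntervalSeconds returns the average number of seconds between
// samples for one series over a window of windowHours.
func impliedIntervalSeconds(windowHours, perSeries float64) float64 {
	return windowHours * 3600 / perSeries
}

func main() {
	// Figures from the notes: 211.8 million datapoints, 69,336 active series.
	per := samplesPerSeries(211.8e6, 69336)
	fmt.Printf("%.0f datapoints per series\n", per)
	fmt.Printf("%.1f s average interval\n", impliedIntervalSeconds(48, per))
}
```

Under those assumptions this works out to roughly 3,055 datapoints per series, or about one sample per minute.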
You've told us this was hard!!
Building new APIs in GHES Manage has too often felt like a challenging expedition: confusing paths, hidden "gotchas," and the occasional moment where you're just… banging your head against the desk wondering why something that should be straightforward is taking so long.
GHES Manage was created three years ago as a modern alternative to the legacy Enterprise Management Console, but adoption hasn't met our expectations. We've heard consistent feedback that implementing new API endpoints can be challenging and unintuitive.
So we've been focused on improving the GHES Manage API.
The following snippet copies a script from enterprise2 to a ghebooted instance:
#!/bin/bash
#/ Usage: ./copy.sh <ghe-boot-host-name>
TARGET_HOST="${1}"
TARGET_USER="admin"
TARGET_PORT="122"
set -e

- Operator manages and configures the instance with the UI
- Operator builds automation with the API
- Operator fights fires with ssh and bash scripts
- Be part of the UI architecture and support a robust UI solution
- Help operators build reliable and efficient automation
We initiated a systematic availability review process following our July 2024 offsite (see Revival of the GHES Availability Review Process). The first availability issue was then created on August 16th, marking almost a year since our previous review.
Our journey began by exploring what availability truly means for GHES. We recognized that an escalation's value extends beyond mere resolution - we aimed to foster deeper discussions, prevent recurrence through measured repair items, and share knowledge via comprehensive runbooks.
Over the past 6 months, we've made significant strides in our availability review processes:
- Created 29 availability issues
- Generated 29 repair items, with 22 successfully resolved
- Conducted 9 availability review meetings
You are an enterprise engineer!
Because GitHub Enterprise Server (GHES) drives our revenue and supports our largest and most recognizable clients, every engineer at GitHub, including yourself, is an enterprise engineer! This lab is an opportunity to practice a few of the concepts you'll need to test code that you write in a GHES environment.
As a prerequisite to this lab, you should watch each part of the Engineering for Enterprise Lecture (TODO). The lecture provides an overview of the tools and concepts that we will practice during this self-directed exercise. After watching the lecture, you should be familiar with the key concepts required to complete this lab:
{
  "incidentStatusedTime": "2024-10-07T11:30",
  // "resolutionTime": "2024-10-31T04:50", // Dotcom specific
  // "visibility": "public", // Dotcom specific
  // "mostSignificantServiceStatus": "red", // Dotcom specific
  // "impactedServices": [], // Dotcom specific
  "resolvingIncidentCommander": "hubot",
  // "incidentUrl": "https://status-staging.githubapp.com/incidents/27863", // Dotcom specific
  // "impactStartTime": "2024-10-31T03:40", // Dotcom specific
  // "impactDetectionTime": "2024-10-31T03:40", // Dotcom specific
In the past two weeks, we've held two Availability Review meetings featuring excellent presenters. These meetings facilitated fruitful discussions on how we can reflect on and learn from customer incidents. (In case you missed either, you can find the recordings for 08-27 and 09-04.)
To make our AR meetings more efficient, here's a guide to the current Availability Review process and how AR issues should be completed. We're also integrating GHES-specific requirements into the overall GitHub automations. Future improvements are expected to ease or eliminate more of the manual steps.
| # | Step | Info |
|---|---|---|
| 1 | Availability Review created at end of GHES SEV 1 | this will be automated in future |
