- AWS CLI installed
- AWS credentials configured
- Session Manager plugin installed
brew install --cask session-manager-plugin
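A quick way to sanity-check these prerequisites (output will vary; the Session Manager plugin prints an installed-successfully message when run with no arguments):
# Verify the CLI, the credentials, and the Session Manager plugin
aws --version
aws sts get-caller-identity
session-manager-plugin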
Note: templates and aliases are migrated only via the fetch command. If you re-run fetch, it checks whether the indices already exist on the target and will not copy them again.
1. Add the policy below to the admin user
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:StartSession",
      "Resource": [
        "arn:aws:ec2:us-east-1:147228461610:instance/i-07bb142a4e1d1015f",
        "arn:aws:ssm:us-east-1:147228461610:document/SSM-dev-BootstrapShell"
      ]
    }
  ]
}
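If you manage the user from the CLI, one way to attach this inline policy is shown below; this is a hedged sketch where the user name and policy name are placeholders and the JSON above is assumed to be saved as ssm-session-policy.json.
# Attach the policy above as an inline policy (placeholder user and policy names)
aws iam put-user-policy --user-name <admin-user> --policy-name AllowBootstrapSSMSession --policy-document file://ssm-session-policy.json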
2. Check the EC2 instance (optional)
aws ec2 describe-instances --instance-ids i-07bb142a4e1d1015f --query 'Reservations[].Instances[].[InstanceId, State.Name, PrivateIpAddress, InstanceType, VpcId, SubnetId, IamInstanceProfile.Arn]' --output text
output:
i-07bb142a4e1d1015f running 10.0.1.233 t2.large vpc-027d21c529ff47624 subnet-03e74a323742dee98 arn:aws:iam::147228461610:instance-profile/migration-assistant-BootstrapEC2InstanceInstanceProfile987ED32E-8mOHqiIafFF9
3. Access bootstrap instance
aws ssm start-session --document-name SSM-dev-BootstrapShell --target i-07bb142a4e1d1015f --region us-east-1
4. Run the script below to prepare the bootstrap instance for deploying the migration pieces. Wait 5-10 minutes for it to finish.
./initBootstrap.sh && cd deployment/cdk/opensearch-service-migration
This creates a couple of script files, such as accessContainer.sh, which is used to connect to the migration console running in an ECS container.
export AWS_DEFAULT_REGION=us-east-1
# This will take 1-2 mins
cdk bootstrap --c contextId=demo-deploy
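To confirm bootstrapping finished, you can check the CDK toolkit stack status (CDKToolkit is the default toolkit stack name, assuming it was not renamed):
aws cloudformation describe-stacks --stack-name CDKToolkit --region us-east-1 --query 'Stacks[0].StackStatus' --output text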
Check what we are going to deploy in this demo (optional)
cdk ls "*" --c contextId=demo-deploy --require-approval never --concurrency 3
Expected output:
App Registry mode is enabled for CFN stack tracking. Will attempt to import the App Registry application from the MIGRATIONS_APP_REGISTRY_ARN env variable of arn:aws:servicecatalog:us-east-1:147228461610:/applications/0a62vr23uwjs7kbbj3q08u4gof and looking in the configured region of us-east-1
Received following context block for deployment:
{
stage: 'dev',
engineVersion: 'OS_2.9',
domainName: 'demo-opensearch-cluster',
dataNodeCount: 2,
availabilityZoneCount: 2,
openAccessPolicyEnabled: true,
domainRemovalPolicy: 'DESTROY',
enableDemoAdmin: true,
trafficReplayerEnableClusterFGACAuth: true,
captureProxyESServiceEnabled: true,
fetchMigrationEnabled: true,
sourceClusterEndpoint: 'https://capture-proxy-es.migration.dev.local:19200'
}
End of context block.
networkStack-default (OSMigrations-dev-us-east-1-default-NetworkInfra)
openSearchDomainStack-default (OSMigrations-dev-us-east-1-default-OpenSearchDomain)
migrationInfraStack (OSMigrations-dev-us-east-1-MigrationInfra)
mskUtilityStack (OSMigrations-dev-us-east-1-MSKUtility)
analyticsDomainStack (OSMigrations-dev-us-east-1-AnalyticsDomain)
migration-analytics (OSMigrations-dev-us-east-1-MigrationAnalytics)
fetchMigrationStack (OSMigrations-dev-us-east-1-FetchMigration)
capture-proxy-es (OSMigrations-dev-us-east-1-CaptureProxyES)
traffic-replayer-default (OSMigrations-dev-us-east-1-default-TrafficReplayer)
migration-console (OSMigrations-dev-us-east-1-MigrationConsole)
Deploy demo setup:
# Deploy the demo setup; this may take up to 1 hour
cdk deploy "*" --c contextId=demo-deploy --require-approval never --concurrency 3
When the stacks are deployed, enable monitoring:
- Activate AWS Cost Explorer
- Confirm the cost tags associated with the solution
Before running this, make sure stack creation has reached the CREATE_COMPLETE state.
aws cloudformation describe-stacks --stack-name OSMigrations-dev-us-east-1-MSKUtility --region us-east-1 --query 'Stacks[0].StackStatus'
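To check all of the demo stacks at once, a query like the one below should work (the stack name prefix is taken from the cdk ls output above):
aws cloudformation describe-stacks --region us-east-1 --query "Stacks[?starts_with(StackName, 'OSMigrations-dev-us-east-1')].[StackName,StackStatus]" --output table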
Now activate the cost allocation tags associated with the solution.
This demo deploys 4 ECS clusters, one of which is migration-dev-capture-proxy-es. There you will see that ES 7.10 is already created with no data in it. In the same ECS service the Capture Proxy is also deployed and running: the Capture Proxy listens on port 9200 and ES 7.10 listens on port 19200. This setup is already done. If you want to deploy the Capture Proxy in front of your own self-hosted ES server, follow the steps here.
- Check all 4 ECS clusters are created in the region where you deployed (a CLI sketch of these checks follows this checklist):
- migration-dev-traffic-replayer-default
- migration-dev-migration-console
- migration-dev-capture-proxy-es
- migration-dev-otel-collector
- Check the EC2 instance bootstrap-dev-instance is created
- Check MSK is created
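A minimal CLI sketch for the checks above; the region and the bootstrap-dev-instance Name tag are assumptions based on this demo:
# List the ECS clusters in the deployment region
aws ecs list-clusters --region us-east-1 --output text
# Confirm the bootstrap EC2 instance is running (assumes its Name tag is bootstrap-dev-instance)
aws ec2 describe-instances --filters "Name=tag:Name,Values=bootstrap-dev-instance" --query 'Reservations[].Instances[].State.Name' --output text
# Confirm the MSK cluster exists
aws kafka list-clusters --region us-east-1 --query 'ClusterInfoList[].ClusterName' --output text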
We will do 2 demos:
- Migrating historical data from ES to AOS
- Migrating live data from ES to AOS
# 1. Open Local Terminal and Authenticate
From IAM Identity Center, copy the AWS environment variables and execute them in your terminal
# 2. Execute below to connect to bootstrap instance
aws ssm start-session --document-name SSM-dev-BootstrapShell --target i-07bb142a4e1d1015f --region us-east-1
# 3. Navigate to deployment folder to access scripts
cd deployment/cdk/opensearch-service-migration/
# 4. Execute below from the bootstrap instance to connect to the migration console terminal: ./accessContainer.sh migration-console STAGE REGION
./accessContainer.sh migration-console dev us-east-1
Modify the runTestBenchmarks.sh script to target port 19200 and index test data into the ES cluster.
- Update script
vi runTestBenchmarks.sh
- Run the benchmark script to populate a couple of indices in ES 7.10
./runTestBenchmarks.sh
- Check for new indices in the ES 7.10 cluster
vi catIndices.sh
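As an alternative to the script, you can query the source cluster directly; the port and demo credentials below mirror the curl calls used later in this doc:
curl "https://capture-proxy-es:19200/_cat/indices?v" -u admin:admin --insecure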
Run the Fetch command from the console terminal
# This will execute the script and print the required ECS run task command
./showFetchMigrationCommand.sh
expected output:
aws ecs run-task --task-definition arn:aws:ecs:us-east-1:147228461610:task-definition/migration-dev-fetch-migration:1 --cluster migration-dev-ecs-cluster --launch-type FARGATE --network-configuration '{"awsvpcConfiguration":{"subnets":["subnet-05643745bbd732274","subnet-03bfe3b54d1f6b4fc"],"securityGroups":["sg-08a49f1b54cb7ec43","sg-079bfe228c2bee4ae"]}}'
Run the command above to initiate the fetch migration, which will launch an ECS task in a new container to execute the data preparation pipeline, transferring data directly from the source ES cluster to Amazon OpenSearch without involving MSK or Capture Proxy.
It will take some time to migrate all the data, depending on its size. Confirm that it migrated all indices, templates, and aliases.
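To watch the fetch task itself, you can poll ECS; this is a hedged sketch where the task family name comes from the run-task command above and <task-arn> is a placeholder:
aws ecs list-tasks --cluster migration-dev-ecs-cluster --family migration-dev-fetch-migration --desired-status RUNNING
aws ecs describe-tasks --cluster migration-dev-ecs-cluster --tasks <task-arn> --query 'tasks[0].lastStatus'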
Index names and document counts after the fetch migration:
- logs-221998 - Document Count: 1000
- geonames - Document Count: 1000
- nyc_taxis - Document Count: 1000
- sonested - Document Count: 2977
- logs-241998 - Document Count: 1000
- sg7-auditlog-2024.04.13 - Document Count: 3
- sg7-auditlog-2024.04.14 - Document Count: 3
- logs-201998 - Document Count: 1000
- sg7-auditlog-2024.04.16 - Document Count: 3
- .kibana_1 - Document Count: 0
- searchguard - Document Count: 0
- 2024-04-15 - Document Count: 8353
- .opensearch-observability - Document Count: 0
- .plugins-ml-config - Document Count: 1
- logs-211998 - Document Count: 1000
- logs-231998 - Document Count: 1000
- logs-191998 - Document Count: 1000
- 2024-04-16 - Document Count: 3282
Note: data in Kafka is not deleted after the replayer runs.
The replayer reproduces the same traffic pattern as the source.
source ---|-------|----|
replayer |-------|----|
The marble diagram illustrates that the source emits events at intervals after 3, 7, and 4 units of time, while the replayer starts immediately after the first event and follows with identical intervals.
This behavior ensures the target cluster can handle the same traffic patterns as the source, a capability not achievable with standard benchmarking tools. Additionally, the replayer is engineered to replicate these patterns across multiple targets, allowing for thorough testing of performance differences.
Send real-time data to the ES 7.10 cluster. At this stage the data goes only to the Capture Proxy, MSK, and ES 7.10.
Execute the command below to send real-time data to ES 7.10
# Run in background
nohup python3 ./live-data.py --endpoint https://capture-proxy-es:9200 &
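live-data.py itself is not shown in this doc; a hypothetical stand-in that indexes one document per second through the Capture Proxy could look like the loop below (the live-demo index name is made up, and the credentials mirror the other curl calls in this demo):
# Hypothetical stand-in for live-data.py: index one doc per second via the Capture Proxy on port 9200
while true; do
  curl -s -X POST "https://capture-proxy-es:9200/live-demo/_doc" -u admin:admin --insecure \
    -H 'Content-Type: application/json' \
    -d "{\"ts\": \"$(date -u +%FT%TZ)\", \"msg\": \"demo event\"}" > /dev/null
  sleep 1
done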
Execute below to start traffic replayer
aws ecs update-service --cluster migration-dev-ecs-cluster --service migration-dev-traffic-replayer-default --desired-count 1
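To verify the replayer task actually came up, you can check the service counts (cluster and service names as deployed in this demo):
aws ecs describe-services --cluster migration-dev-ecs-cluster --services migration-dev-traffic-replayer-default --query 'services[0].[desiredCount,runningCount]' --output text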
Confirm the migration is happening:
./stats.sh
At this point the 2024-04-19 index exists only in ES 7.10, with 879 docs.
Run the command below to start the traffic replayer and send data to AOS:
aws ecs update-service --cluster migration-dev-ecs-cluster --service migration-dev-traffic-replayer-default --desired-count 1
Check the count of the docs in target cluster (AOS).
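One way to do this from the migration console, assuming you substitute your own AOS domain endpoint and demo-admin credentials (both are placeholders here):
curl "https://<AOS_ENDPOINT>/_cat/indices?v" -u <admin-user>:<password> --insecure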
curl https://capture-proxy-es:9200/_cat/templates?v -u admin:admin --insecure
curl -X PUT "https://capture-proxy-es:19200/_template/template_1?pretty" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
"index_patterns": ["te*", "bar*"],
"settings": {
"number_of_shards": 1
},
"mappings": {
"_source": {
"enabled": false
},
"properties": {
"host_name": {
"type": "keyword"
},
"created_at": {
"type": "date",
"format": "EEE MMM dd HH:mm:ss Z yyyy"
}
}
}
}
'
curl -X PUT "https://capture-proxy-es:19200/logs-221998/_alias/alias1?pretty" -u admin:admin --insecure
Check the number of events in the Kafka topic, i.e. whether there are any events in the MSK topic logging-traffic-topic:
./kafka-tools/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --topic logging-traffic-topic --time -1 --command-config kafka-tools/aws/msk-iam-auth.properties
output:
logging-traffic-topic:0:9
Partition 0 contains 9 events.
Show the progress of the traffic replayer: how far it has gotten processing those records in MSK, i.e. how many records have been consumed from Kafka.
./kafka-tools/kafka/bin/kafka-consumer-groups.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --timeout 100000 --describe --group logging-group-default --command-config kafka-tools/aws/msk-iam-auth.properties
output:
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
logging-group-default logging-traffic-topic 0 8 9 1 consumer-logging-group-default-1-261cc380-f44f-45b1-97d0-beb1e1a966d1 /10.0.1.202 consumer-logging-group-default-1
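Reading this output: LAG = LOG-END-OFFSET - CURRENT-OFFSET = 9 - 8 = 1, so one captured request has not yet been replayed.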
Let's stop the traffic replayer and delete the Kafka topic. Next we will index some data on port 9200, which will go via the Capture Proxy to MSK and ES 7.10.
Execute to stop traffic replayer:
aws ecs update-service --cluster migration-dev-ecs-cluster --service migration-dev-traffic-replayer-default --desired-count 0
Delete Kafka Topic named "logging-traffic-topic"
./kafka-tools/kafka/bin/kafka-topics.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --delete --topic logging-traffic-topic --command-config kafka-tools/aws/msk-iam-auth.properties
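To confirm the topic was removed, you can list the remaining topics with the same tooling:
./kafka-tools/kafka/bin/kafka-topics.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --list --command-config kafka-tools/aws/msk-iam-auth.properties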
Stop the background live-data.py process:
ps aux | grep python
kill -9 <pid>
## Example
root@ip-10-0-1-144:~# ps aux|grep python
root 8407 1.4 1.2 29512 24892 pts/0 S 21:36 0:12 python3 ./live-data.py --endpoint https://capture-proxy-es:9200
root 8486 0.0 0.1 4024 1992 pts/0 S+ 21:51 0:00 grep --color=auto python
root@ip-10-0-1-144:~# kill -9 8407
# Disable bracketed paste mode if the terminal prints stray escape sequences after the session
printf '\e[?2004l'