Created July 18, 2020 22:12
TensorFlow Distributed Training on Kubeflow with TFJob
Name:         mnist-tensorflow-job
Namespace:    default
Labels:       <none>
Annotations:  API Version:  kubeflow.org/v1
Kind:         TFJob
Metadata:
  Creation Timestamp:  2020-07-18T18:54:31Z
  Generation:          1
  Resource Version:    43041
  Self Link:           /apis/kubeflow.org/v1/namespaces/default/tfjobs/mnist-tensorflow-job
  UID:                 0b5b088f-0690-4089-8b18-1b4188eb345a
Spec:
  Tf Replica Specs:
    PS:
      Replicas:        1
      Restart Policy:  Never
      Template:
        Metadata:
          Annotations:
            sidecar.istio.io/inject:  false
        Spec:
          Containers:
            Image:  docker.io/<DOCKER_HUB_USERNAME>/tf-dist-mnist-test:1.0
            Name:   tensorflow
    Worker:
      Replicas:        2
      Restart Policy:  Never
      Template:
        Metadata:
          Annotations:
            sidecar.istio.io/inject:  false
        Spec:
          Containers:
            Image:  docker.io/<DOCKER_HUB_USERNAME>/tf-dist-mnist-test:1.0
            Name:   tensorflow
Status:
  Completion Time:  2020-07-18T18:56:16Z
  Conditions:
    Last Transition Time:  2020-07-18T18:54:31Z
    Last Update Time:      2020-07-18T18:54:31Z
    Message:               TFJob mnist-tensorflow-job is created.
    Reason:                TFJobCreated
    Status:                True
    Type:                  Created
    Last Transition Time:  2020-07-18T18:54:36Z
    Last Update Time:      2020-07-18T18:54:36Z
    Message:               TFJob mnist-tensorflow-job is running.
    Reason:                TFJobRunning
    Status:                False
    Type:                  Running
    Last Transition Time:  2020-07-18T18:56:16Z
    Last Update Time:      2020-07-18T18:56:16Z
    Message:               TFJob mnist-tensorflow-job successfully completed.
    Reason:                TFJobSucceeded
    Status:                True
    Type:                  Succeeded
  Replica Statuses:
    PS:
      Succeeded:  1
    Worker:
      Succeeded:  2
  Start Time:     2020-07-18T18:54:31Z
Events:
  Type    Reason                   Age    From         Message
  ----    ------                   ----   ----         -------
  Normal  SuccessfulCreatePod      5m6s   tf-operator  Created pod: mnist-tensorflow-job-worker-0
  Normal  SuccessfulCreatePod      5m6s   tf-operator  Created pod: mnist-tensorflow-job-worker-1
  Normal  SuccessfulCreateService  5m6s   tf-operator  Created service: mnist-tensorflow-job-worker-0
  Normal  SuccessfulCreateService  5m6s   tf-operator  Created service: mnist-tensorflow-job-worker-1
  Normal  SuccessfulCreatePod      5m5s   tf-operator  Created pod: mnist-tensorflow-job-ps-0
  Normal  SuccessfulCreateService  5m5s   tf-operator  Created service: mnist-tensorflow-job-ps-0
  Normal  ExitedWithCode           3m21s  tf-operator  Pod: default.mnist-tensorflow-job-worker-0 exited with code 0
  Normal  TFJobSucceeded           3m21s  tf-operator  TFJob mnist-tensorflow-job successfully completed.
  Normal  SuccessfulDeletePod      3m21s  tf-operator  Deleted pod: mnist-tensorflow-job-worker-1
  Normal  SuccessfulDeleteService  3m21s  tf-operator  Deleted service: mnist-tensorflow-job-worker-1
  Normal  SuccessfulDeletePod      3m20s  tf-operator  Deleted pod: mnist-tensorflow-job-ps-0
  Normal  SuccessfulDeleteService  3m20s  tf-operator  Deleted service: mnist-tensorflow-job-ps-0
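
The Conditions list in this output records each state the job passed through, and the Running condition reads False once Succeeded becomes True. The following is a minimal sketch of how that list could be interpreted, assuming the conditions have been loaded into Python dicts with the camelCase keys `type`, `status` and `lastTransitionTime` (for instance from the job's JSON form). The helper names `current_phase` and `run_seconds` are hypothetical and not part of any Kubeflow API.

```python
from datetime import datetime

# Values copied from the describe output; the dict layout is an
# assumption about how the conditions were loaded.
CONDITIONS = [
    {"type": "Created", "status": "True", "lastTransitionTime": "2020-07-18T18:54:31Z"},
    {"type": "Running", "status": "False", "lastTransitionTime": "2020-07-18T18:54:36Z"},
    {"type": "Succeeded", "status": "True", "lastTransitionTime": "2020-07-18T18:56:16Z"},
]


def parse_time(timestamp):
    """Parse the UTC timestamps in the format shown in the output."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")


def current_phase(conditions):
    """Return the type of the most recent condition whose status is "True"."""
    active = [c for c in conditions if c["status"] == "True"]
    if not active:
        return None
    latest = max(active, key=lambda c: parse_time(c["lastTransitionTime"]))
    return latest["type"]


def run_seconds(start_time, completion_time):
    """Elapsed whole seconds between Start Time and Completion Time."""
    delta = parse_time(completion_time) - parse_time(start_time)
    return int(delta.total_seconds())


print(current_phase(CONDITIONS))                                   # Succeeded
print(run_seconds("2020-07-18T18:54:31Z", "2020-07-18T18:56:16Z"))  # 105
```

The 105 seconds agree with the event ages: the pods were created 5m6s ago and the job succeeded 3m21s ago.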
apiVersion: "kubeflow.org/v1"
kind: "TFJob"
metadata:
  name: "mnist-tensorflow-job"
spec:
  tfReplicaSpecs:
    # One parameter server replica.
    PS:
      replicas: 1
      restartPolicy: Never
      template:
        metadata:
          annotations:
            # Disable Istio sidecar injection for this pod.
            sidecar.istio.io/inject: "false"
        spec:
          containers:
            - name: tensorflow
              # Replace <DOCKER_HUB_USERNAME> with your Docker Hub account.
              image: docker.io/<DOCKER_HUB_USERNAME>/tf-dist-mnist-test:1.0
    # Two worker replicas running the same image.
    Worker:
      replicas: 2
      restartPolicy: Never
      template:
        metadata:
          annotations:
            sidecar.istio.io/inject: "false"
        spec:
          containers:
            - name: tensorflow
              image: docker.io/<DOCKER_HUB_USERNAME>/tf-dist-mnist-test:1.0
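
The event log earlier in this gist shows tf-operator creating one pod and one service per replica, each named `<job name>-<replica type>-<index>`. The sketch below derives those names from the replica counts in this manifest. The naming rule is inferred only from those events, and the helper `expected_pods` is hypothetical rather than part of any Kubeflow API.

```python
def expected_pods(job_name, replica_specs):
    """List pod names as <job>-<lowercased replica type>-<index>.

    The pattern is inferred from the events in the describe output,
    not taken from the operator's documentation.
    """
    names = []
    for replica_type, spec in replica_specs.items():
        for index in range(spec["replicas"]):
            names.append(f"{job_name}-{replica_type.lower()}-{index}")
    return names


# Replica counts copied from the manifest.
REPLICA_SPECS = {"PS": {"replicas": 1}, "Worker": {"replicas": 2}}

for name in expected_pods("mnist-tensorflow-job", REPLICA_SPECS):
    print(name)
```

With the counts from this manifest it lists one `ps` pod and two `worker` pods, matching the three `SuccessfulCreatePod` events.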