data-job
DataJob CRD, GVK: data.demo.orkestra.io/v1alpha1, Kind=DataJob
Overview
data-job is a custom resource managed by the
data-platform operator running on Orkestra.
Resources of this type are namespace-scoped.
| Field | Value |
|---|---|
| API Version | data.demo.orkestra.io/v1alpha1 |
| Kind | DataJob |
| GVR (plural) | data.demo.orkestra.io/v1alpha1, Resource=datajobs |
| Scope | Namespaced |
Reconcile Mode
This CRD runs in dynamic mode. Orkestra works directly with the raw unstructured Kubernetes object, so no Go code is required. Reconcile logic is expressed declaratively in the Katalog YAML using template expressions such as {{ .spec.field }}.
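To make the idea of evaluating a template expression against an unstructured object concrete, here is a minimal Python sketch. It is not Orkestra's template engine: the `lookup` and `render` helpers, the instance name, and the `spec` fields are all hypothetical, and only plain dotted paths such as `{{ .spec.field }}` are handled.

```python
import re

# Matches expressions such as {{ .spec.field }} (a dotted path into the object).
_EXPR = re.compile(r"\{\{\s*((?:\.[A-Za-z_][A-Za-z0-9_]*)+)\s*\}\}")


def lookup(obj: dict, path: str):
    """Walk a dotted path like '.spec.field' through nested dicts."""
    current = obj
    for key in path.strip(".").split("."):
        current = current[key]  # raises KeyError if the field is absent
    return current


def render(template: str, obj: dict) -> str:
    """Replace every {{ .a.b }} expression with the value found in obj."""
    return _EXPR.sub(lambda m: str(lookup(obj, m.group(1))), template)


# Hypothetical DataJob instance; the spec fields are invented for illustration.
data_job = {
    "apiVersion": "data.demo.orkestra.io/v1alpha1",
    "kind": "DataJob",
    "metadata": {"name": "nightly-export", "namespace": "demo"},
    "spec": {"source": "s3://bucket/in", "replicas": 2},
}

rendered = render("{{ .metadata.name }}-config: {{ .spec.replicas }} replicas", data_job)
print(rendered)  # nightly-export-config: 2 replicas
```

Because the object is an ordinary nested mapping, no typed struct is needed to read its fields, which is the property dynamic mode relies on.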
The Generic Reconciler manages the full CR lifecycle: it ensures managed labels are set on every reconcile, adds and removes finalizers, runs onCreate and onReconcile template blocks, emits events, increments metrics, and reports health status.
Configuration
The operator maintains 3 worker goroutines to process reconcile events concurrently. Each worker dequeues one CR key at a time, reconciles it, and returns. The queue has a maximum depth of 100 events.
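The worker and queue settings above can be pictured with a small Python sketch. It uses threads in place of goroutines, and the `run_workers` helper is hypothetical. How Orkestra behaves when the queue is full is not stated here, so the blocking `put` below is an assumption.

```python
import queue
import threading

WORKERS = 3       # worker count stated above
MAX_DEPTH = 100   # maximum queue depth stated above


def run_workers(keys, reconcile):
    """Reconcile every key using WORKERS threads and a bounded queue."""
    work = queue.Queue(maxsize=MAX_DEPTH)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            key = work.get()              # dequeue one CR key at a time
            if key is None:               # sentinel: shut this worker down
                return
            try:
                reconcile(key)
                outcome = "ok"
            except Exception:             # a failed reconcile must not kill the worker
                outcome = "error"
            with lock:
                results[key] = outcome

    threads = [threading.Thread(target=worker) for _ in range(WORKERS)]
    for t in threads:
        t.start()
    for key in keys:
        work.put(key)                     # assumption: block while the queue is full
    for _ in threads:
        work.put(None)                    # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Calling `run_workers(["demo/job-a", "demo/job-b"], print)` prints each key once and returns a mapping of key to `"ok"` or `"error"`. At most three reconciles run at the same time, regardless of how many keys are waiting.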
Orkestra resyncs all managed resources every 15 seconds by re-enqueueing every CR key. This ensures that drift caused by external changes is corrected even when no Kubernetes watch event fires.
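The resync behaviour can be sketched as a timer that re-enqueues every known key. This is an illustration under stated assumptions, not Orkestra's implementation: `list_keys` and `enqueue` are hypothetical callables standing in for the operator's cache and work queue.

```python
import threading

RESYNC_INTERVAL_SECONDS = 15  # interval stated above


def resync_once(list_keys, enqueue):
    """Re-enqueue every known CR key, whether or not anything changed."""
    keys = list(list_keys())
    for key in keys:
        enqueue(key)
    return len(keys)


def start_resync(list_keys, enqueue, stop, interval=RESYNC_INTERVAL_SECONDS):
    """Call resync_once every `interval` seconds until `stop` is set."""

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            resync_once(list_keys, enqueue)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Because every key is enqueued on each tick, a child resource that was edited or deleted out of band is reconciled again within one interval, even if the operator never received an event for it.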
Child Resources
When the operator reconciles a data-job instance it creates and manages
the following Kubernetes resources on its behalf. These are owned by the CR via owner references
and are deleted automatically when the CR is deleted (unless deletion protection is active).
Resources listed under onCreate are created on the first reconcile. Resources listed under onReconcile are re-applied on every reconcile cycle. A resource appearing in both phases is created once and kept in sync thereafter.
| Kind | Count | Lifecycle phases |
|---|---|---|
| ConfigMap | 1 | onCreate |
| ReplicaSet | 1 | onReconcile |
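The phase rules described above can be summarised in a few lines of Python. This is a sketch only: `resources_to_apply` is a hypothetical helper, and it assumes that "every reconcile cycle" includes the first one.

```python
def resources_to_apply(children, first_reconcile):
    """Return the child kinds to apply on this reconcile.

    children maps a kind to the set of lifecycle phases it is listed under.
    """
    selected = []
    for kind, phases in children.items():
        if "onReconcile" in phases:
            selected.append(kind)  # re-applied on every cycle
        elif first_reconcile and "onCreate" in phases:
            selected.append(kind)  # created only on the first reconcile
    return selected


# Phases taken from the table above.
DATA_JOB_CHILDREN = {
    "ConfigMap": {"onCreate"},
    "ReplicaSet": {"onReconcile"},
}

print(resources_to_apply(DATA_JOB_CHILDREN, first_reconcile=True))   # ['ConfigMap', 'ReplicaSet']
print(resources_to_apply(DATA_JOB_CHILDREN, first_reconcile=False))  # ['ReplicaSet']
```

Under these rules the ConfigMap is written once and then left alone, while the ReplicaSet is brought back in line with the CR on each cycle.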
To see the actual child resources created for a running instance, navigate to the instance's detail page from the data-platform control panel.
kubectl Reference
Use the commands below to interact with data-job resources from the command line.
List resources
kubectl get datajob -n <namespace>
Describe a resource
kubectl describe datajob <name> -n <namespace>
Get YAML
kubectl get datajob <name> -n <namespace> -o yaml
Watch for changes
kubectl get datajob -n <namespace> -w
Delete a resource
kubectl delete datajob <name> -n <namespace>
Filter by Orkestra managed label
kubectl get datajob -l orkestra.orkspace.io/managed=true -n <namespace>
Access Control
The operator holds the following RBAC permissions to manage data-job resources.
There is one rule each for configmaps, datajobs, and datajobs/status.
| API Groups | Resources | Verbs |
|---|---|---|
| data.demo.orkestra.io | datajobs | get, list, watch, create, update, patch, delete |
| data.demo.orkestra.io | datajobs/status | get, update, patch |
| core | configmaps | get, list, watch, create, update, patch, delete |
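As a quick way to reason about the rules listed above, the following Python sketch encodes them as data and checks whether a given verb is covered. It is not a Kubernetes authorizer, and `RULES` and `is_allowed` are hypothetical names; the rule contents are transcribed from the table.

```python
ALL_VERBS = {"get", "list", "watch", "create", "update", "patch", "delete"}

# Rules transcribed from the table above. The table labels the core group
# "core"; in a Kubernetes Role manifest that group is the empty string "".
RULES = [
    {"apiGroup": "data.demo.orkestra.io", "resource": "datajobs", "verbs": ALL_VERBS},
    {"apiGroup": "data.demo.orkestra.io", "resource": "datajobs/status",
     "verbs": {"get", "update", "patch"}},
    {"apiGroup": "core", "resource": "configmaps", "verbs": ALL_VERBS},
]


def is_allowed(api_group: str, resource: str, verb: str) -> bool:
    """Return True if any listed rule grants `verb` on `resource`."""
    return any(
        rule["apiGroup"] == api_group
        and rule["resource"] == resource
        and verb in rule["verbs"]
        for rule in RULES
    )


print(is_allowed("core", "configmaps", "create"))                       # True
print(is_allowed("data.demo.orkestra.io", "datajobs/status", "delete")) # False
```

The second call returns False because the status subresource is granted only get, update, and patch.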