OpenShift Cheatsheet

Revision as of 09:20, 14 February 2026

Some helpful OpenShift commands that work since version 4.11 (at least)

updated for version: 4.19

Login

How to get a token: https://oauth-openshift.apps.ocp.example.com/oauth/token/display

You might need it for login or automation.

$ oc login --token=... --server=https://api.ocp.example.com:6443

Use the token directly against the API:

$ curl -H "Authorization: Bearer $TOKEN" https://api.ocp.example.com:6443/apis/user.openshift.io/v1/users/~

Login with username/password:

$ oc login -u admin -p password https://api.ocp.example.com:6443

Get console URL:

$ oc whoami --show-console

CLI tool

Enable autocompletion

oc completion bash > /etc/profile.d/oc_completion_bash.sh

Resources


Common info

General cluster/resource info:

$ oc cluster-info

Which resource types are there?

$ oc api-resources (--namespaced=false)(--api-group=config.openshift.io)(--api-group='')
                 (cluster-scoped only)(openshift-specific)               (core API group only)

Explain resources:

$ oc explain service

Describe resources:

$ oc describe service

Inspect resources:

$ oc adm inspect deployment XYZ --dest-dir /home/student/inspection

(Attention: check the resulting files for secrets, passwords, private keys etc. before sending them anywhere)

Get all resources:

$ oc get all

(Attention: templates, secrets, configmaps and PVCs are not included in "oc get all"; list them separately)

$ oc get template,secret,cm,pvc

List resources in context of another user/serviceaccount:

$ oc get persistentvolumeclaims -n openshift-monitoring --as=system:serviceaccount:openshift-monitoring:default

Resources which are not shown with the "oc get all" command

$ oc api-resources --verbs=list --namespaced -o name | xargs -n 1 oc get --show-kind --ignore-not-found -n mynamespace

Nodes

Get status of all nodes:

$ oc get nodes

Get logs of a node (optionally of a specific systemd unit):

$ oc adm node-logs <nodename> -u crio

Compare allocatable resources vs. capacity:

$ oc get nodes <nodename> -o jsonpath='{"Allocatable:\n"}{.status.allocatable}{"\n\n"}{"Capacity:\n"}{.status.capacity}{"\n"}'

Get resource consumption:

$ oc adm top nodes

Be careful! By default only the free memory is shown, not the allocatable memory. For a more realistic picture:

$ oc adm top nodes --show-capacity

( https://www.redhat.com/en/blog/using-oc-adm-top-to-monitor-memory-usage )

Draining nodes

Empty a node and put it into maintenance mode (e.g. before rebooting):

$ oc adm cordon <node1> (not strictly necessary; drain cordons the node anyway, see below)
$ oc adm drain <node1> --delete-emptydir-data=true --ignore-daemonsets=true

After reboot:

$ oc adm uncordon <node1>

Machines

Show all machines (including their age):

$ oc get machines -A

Get state paused/not paused of machineconfigpool:

$ oc get mcp worker -o jsonpath='{.spec.paused}'

Machinesets

Scale number of machines/nodes up/down:

$ oc scale --replicas=2 machineset <machineset> -n openshift-machine-api

Delete and re-create machines/nodes

Count the current machines and note the result as MACHINECOUNT, annotate the machine that should go away, then scale up (a replacement is created) and back down (the annotated machine is removed):

$ oc get machines -A | grep worker-<XY> | wc -l
$ oc annotate machine/<machine-name> -n openshift-machine-api machine.openshift.io/delete-machine="true"
$ oc scale --replicas=$((MACHINECOUNT+1)) machineset <machineset> -n openshift-machine-api
$ oc scale --replicas=$MACHINECOUNT machineset <machineset> -n openshift-machine-api

Projects/Namespaces

Switch namespace:

$ oc project <namespace>

Switch back to the default namespace:

$ oc project default

Registries

  • registry.access.redhat.com (no login required)
  • registry.redhat.io (with login only)
  • quay.io
  • docker.io

Images

Search for images with podman:

$ podman search <wordpress>

Look into images:

oc image info registry.redhat.io:8443/ubi8/httpd-24:1-209 (-o json | jq -r .digest)

Update the image of a running deployment:

oc set image deployment/mydb mariadb-80=docker.io/ubuntu18/mysql-80:1-228

Inspect images directly on a node (with crictl):

crictl images
crictl ps --name httpd-24 -o yaml
crictl images --digests <shasum>

When you have an account for a registry:

 skopeo login <registry>:8443 -u <username>
 skopeo inspect docker://registry.redhat.io:8443/ubi8/httpd-24:1-209
 skopeo inspect --config docker://registry.redhat.io/rhel8/httpd-24

Add the "latest" tag to a specific image:

skopeo copy docker://registry.redhat.io:8443/ubi8/httpd-24:1-215  docker://registry.redhat.io:8443/ubi8/httpd-24:latest

Create pod from image

$ skopeo login -u user -p password registry.redhat.io
$ skopeo list-tags docker://docker.io/nginx
$ oc run <mypod-nginx> --image docker.io/nginx:stable-alpine (--env NGINX_VERSION=1.24.1)

Apps

Create new app

with label and parameters

from template

$ oc new-app (--name mysql-server) -l team=red --template=mysql-persistent -p MYSQL_USER=developer -p MYSQL_PASSWORD=topsecret

from image

$ oc new-app -l team=blue --image registry.redhat.io/rhel9/mysql-80:1 -e MYSQL_ROOT_PASSWORD=redhat -e MYSQL_USER=developer -e MYSQL_PASSWORD=evenmoresecret

Deployments

Create Deployment from image

$ oc create deployment demo-pod --port 3306  --image registry.ocp.example.de:8443/rhel9/mysql-80

Environment variables

Set environment variables on running deployment:

$ oc set env deployment/helloworld MYSQL_USER=user1  MYSQL_PASSWORD=f00bar MYSQL_DATABASE=testdb
oc set env deployment/mariadb MARIADB_DATABASE=wikidb
oc set env deployment/mariadb MARIADB_USER=mediawiki
oc set env deployment/mariadb MARIADB_PASSWORD=wikitopsecret
oc set env deployment/mariadb MARIADB_ROOT_PASSWORD=gehheim

(Not recommended for passwords; better use secrets and configmaps, see below)

oc set env deployment/mariadb --from=secret/my-secret (--prefix=MYSQL_)
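
For reference, a rough sketch of what `--from=secret` with a prefix adds to the container spec (names taken from the placeholder command above, not generated output):

```yaml
# Fragment of the deployment's pod template after
# "oc set env deployment/mariadb --from=secret/my-secret --prefix=MYSQL_"
spec:
  containers:
  - name: mariadb
    envFrom:
    - prefix: MYSQL_
      secretRef:
        name: my-secret
```

Every key in the secret becomes an environment variable, with the prefix prepended to its name.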

Restart deployment after change

$ oc rollout restart deployment testdeploy

(obsolete:
in older versions the deployment resource had no rollout option -> you had to patch something to force a restart, e.g.:

$ oc patch deployment testdeploy --patch "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"last-restart\":\"`date +'%s'`\"}}}}}"

)

Add volume to deployment

configmap

$ oc set volume deployment/<mydeployment> --add --type configmap --name <myvol> --configmap-name <mymap> --mount-path </var/www/html>

pvc

$ oc set volume deployment/<mydeployment> --add --type pvc --name <mypvc-vol> --claim-name <mypvc> --mount-path </var/lib/mysql> (--claim-class <storage class> --claim-mode RWX|RWO --claim-size 1G)
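
As a sketch (placeholder names as in the command above), `--add --type pvc` results in a volume plus a mount in the pod template, roughly:

```yaml
spec:
  volumes:
  - name: mypvc-vol
    persistentVolumeClaim:
      claimName: mypvc
  containers:
  - name: <container-name>
    volumeMounts:
    - name: mypvc-vol
      mountPath: /var/lib/mysql
```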

Remove volume from deployment

$ oc set volume deployment/file-sharing --remove --name=<vol-name>

Make deployment available from inside/outside

Create service from deployment

$ oc expose deployment <mydeployment> --name <service-mynewapp> (--selector app=<myapp>) --port 8080 --target-port 8080

Create route from service

$ oc expose service <service-mynewapp> --name <route-to-mynewapp>

Alternative ingress:

Create ingress for service

$ oc create ingress <ingress-mynewapp> --rule="mynewapp.ocp4.example.de/*=service-mynewapp:8080"

Afterwards the app is reachable from outside.
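
The `oc create ingress` rule above corresponds roughly to this manifest (host, service name and port are taken from the command; this is a sketch, not the verbatim generated object):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-mynewapp
spec:
  rules:
  - host: mynewapp.ocp4.example.de
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: service-mynewapp
            port:
              number: 8080
```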

Add probes

Configure readiness probe for deployment:

$ oc set probe deployment/<testdeploy> --readiness --failure-threshold 7 --get-url http://:3000/api/health
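
What `oc set probe` writes into the container spec, roughly (the container name is a placeholder):

```yaml
spec:
  containers:
  - name: <container-name>
    readinessProbe:
      failureThreshold: 7
      httpGet:
        path: /api/health
        port: 3000
```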

Autoscale Pods

$ oc autoscale deployment/test --min 2 --max 10 --cpu-percent 80

Reduce/Upgrade cpu/mem requests

Reduce memory requests:

$ oc set resources deployment/huge-mem --requests memory=250Mi

Security

Problem web server

In some images the web server listens on port 80, which leads to permission problems in OpenShift, as the default security context constraints do not allow apps to bind to privileged ports (< 1024).

Error message:

(13)Permission denied: AH00072: make_sock: could not bind to address [::]:80
(13)Permission denied: AH00072: make_sock: could not bind to address 0.0.0.0:80


-> either choose an image that uses a port >= 1024
-> or add permissions to the corresponding service account

$ oc get pod <your pod name> -o yaml | grep -i serviceAccountName
   serviceAccountName: default
$ oc adm policy add-scc-to-user anyuid -z default

(when you want to get rid of this setting again you have to edit the annotations field of the deployment and re-create the pod)

$ oc delete pod <your pod name>

Pods

Get resource consumption of all pods:

$ oc adm top pods -A --sum

Get resource consumption of pods and containers

$ oc adm top pods -n <openshift-etcd> --containers

Get all pods on a specific node:

$ oc get pods --field-selector spec.nodeName=ocp-abcd1-worker-0 (-l myawesomelabel)

Get only pods from deployment mysql:

$ oc get pods -l deploymentconfig=mysql

Get pods' readinessProbe:

 $ oc get pods -o jsonpath='{.items[0].spec.containers[0].readinessProbe}' | jq

Connect to pod and open a shell:

$ oc exec -it <podname> -- /bin/bash

Copy file(s) to pod:

$ oc cp mysqldump.sql mysql-server:/tmp

Jobs and Cronjobs

Create Job from image

$ oc create job testjob --image registry.ocp.example.de:8443/rhel9/mysql-80 -- /bin/bash -c "mysql -e 'create database events'; mysql events -e 'source /tmp/dump.sql;'"

Cronjob:

$ oc create cronjob mynewjob --image registry.ocp4.example.de:8443/ubi8/ubi:latest --schedule='* * * * 5' -- /bin/bash -c 'if [ $(date +%H) -gt 15 ]; then echo "Hands up, weekend!"; fi'

Check output of job:

$ oc logs job/<name>
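
Quoting caveat for the cronjob command: inside double quotes your local shell expands $(date +%H) once, at creation time, so the container would get a fixed hour baked in; single quotes pass the expression through literally so it is evaluated inside the container at run time. A quick local demonstration (no cluster needed):

```shell
# With double quotes, $(date +%H) is substituted immediately by the
# shell that creates the cronjob -- the command stores a fixed hour.
DOUBLE="if [ $(date +%H) -gt 15 ]; then echo weekend; fi"
# With single quotes, the literal $(date +%H) survives and is
# evaluated inside the container at run time.
SINGLE='if [ $(date +%H) -gt 15 ]; then echo weekend; fi'
echo "$DOUBLE"
echo "$SINGLE"
```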

Secrets

Create Secret

from String

$ oc create secret generic <test> --from-literal=foo=bar

from file

$ oc create secret generic <sshkeys> --from-file id_rsa=/path-to/id_rsa --from-file id_rsa.pub=/path-to/id_rsa.pub

as TLS secret

$ oc create secret tls <secret-tls> --cert /tmp/mydomain.crt --key /tmp/mydomain.key

Update Secret

$ oc set data secret/<mysecret> --from-file /tmp/root-password

Extract secret

$ oc extract secret/<mysecret> --to /tmp/mysecret (--confirm)

Configmaps

Create configmap from file

$ oc create configmap <mymap> --from-file=/tmp/dump.sql

Other Information

Sort Events by time:

$ oc get events --sort-by=lastTimestamp

Show egress IPs:

$ oc get hostsubnets (OpenShift SDN)
$ oc get egressips (OVN-Kubernetes)

Show/edit initial configuration:

$ oc get cm cluster-config-v1 -o yaml -n kube-system
  (use "oc edit" instead of "oc get" to change it)

List alerts:

$ oc -n openshift-monitoring exec -ti alertmanager-main-0 -c alertmanager -- amtool alert --alertmanager.url=http://localhost:9093 -o extended

List silences:

$ oc -n openshift-monitoring exec -ti alertmanager-main-0 -c alertmanager -- amtool silence query [alertname=ClusterNotUpgradable] --alertmanager.url=http://localhost:9093

https://cloud.redhat.com/blog/how-to-use-amtool-to-manage-red-hat-advanced-cluster-management-for-kubernetes-alerts

User rights to resources:

$ oc adm policy who-can <verb> <resource>
$ oc adm policy who-can patch machineconfigs

Changes with patch command

Patch single value of resource:

$ oc patch installplan install-defgh -n openshift-operators-redhat --type merge  --patch '{"spec":{"approved":true}}'

Patch resource by help of a file:

$ oc patch --type=merge mc 99-worker-ssh --patch-file=/tmp/patch_mc-worker-ssh.yaml

Content of patch_mc-worker-ssh.yaml:

spec:
  config:
    passwd:
      users:
      - name: core
        sshAuthorizedKeys:
        - |
          ssh-rsa AAAAB3NzaZ1yc2EAAAADAQABAAABAQDOMsVGOvN3ap+MWr7eqZpBfDLTcmFdKhozJGStwXsTrP6QJYlxwP1ITZH7tPMfD0zkHu+y7XzcPqybwmnK4hPhuzxUl4qXqdTkTUUJjy3eVPk7n3RHHdsI2yS5YnlcySnTvkYAOuMStDDhN1MF6xOwxqXOq6xalzZzt7j/MtcceHxIdB19i0Fp4XYRTfv9p3UTFFkP9DoRnspNI0TtIg8YfzYcHJy/bDhEfi6+t0UBcksUqVWpVY2jX2Nco1qfC+/E2ooWalMzYUsB4ctU4OqiLd5qxmMevn9J+knPVhiWLE41d7dReVHkNyao2HZUH1r6E6B7/n/m0+XS0qJeA0Hh testy@pc01
          ssh-rsa AAABBBCCC0815....QWertzu007Xx foobar@pc02

Attention: the former content of sshAuthorizedKeys will be overwritten!

Patch a secret with base64-encoded data. First create a yaml file with the desired content:

$ head /tmp/alertmanager.yaml
global:
 resolve_timeout: 5m
 smtp_from: openshift-admin@example.de
 smtp_smarthost: 'loghorst.example.de:25'
 smtp_hello: localhost
 (...)
$ tail /tmp/alertmanager.yaml
(...)
time_intervals:
 - name: work_hours
   time_intervals:
     - weekdays: ["monday:friday"]
       times:
         - start_time: "07:00"
           end_time: "17:00"
       location: Europe/Zurich
$ oc patch secret alertmanager-main -p '{"data": {"config.yaml": "'$(base64 -w0 /tmp/alertmanager.yaml)'"}}'
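
The payload construction can be checked locally before patching the cluster; a minimal sketch with a sample file (the path and content below are illustrative assumptions, not the real alertmanager config):

```shell
# Build the same JSON patch payload as in the oc patch command above,
# using a small sample file instead of the real config.
cat > /tmp/alertmanager-sample.yaml <<'EOF'
global:
 resolve_timeout: 5m
EOF
PAYLOAD='{"data": {"config.yaml": "'$(base64 -w0 /tmp/alertmanager-sample.yaml)'"}}'
echo "$PAYLOAD"
# Decode the embedded value again to check it round-trips to the source:
echo "$PAYLOAD" | cut -d'"' -f6 | base64 -d
```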

Examples

Set master/worker to (un)paused:

$ oc patch --type=merge --patch='{"spec":{"paused":false}}' machineconfigpool/{master,worker}

Set maximum number of unavailable workers to 2:

$ oc patch --type=merge --patch='{"spec":{"maxUnavailable":2}}' machineconfigpool/worker

(default=1)

Logging

Watch logs of a certain pod (or container)

$ oc logs <podname> (-c <container>)

Debug pod (e.g. if it is in CrashLoopBackOff):

$ oc debug pod/<podname>

Node logs of the systemd unit crio:

$ oc adm node-logs master01 -u crio --tail 2

The same for all masters:

$ oc adm node-logs --role master -u crio --tail 2

Liveness/readiness probes of all pods at a certain timestamp:

$ oc adm node-logs --role worker -u kubelet | grep -E 'Liveness|Readiness' | grep "Aug 21 11:22"

Space allocation of logging:

$ POD=elasticsearch-cdm-<ID>
$ oc -n openshift-logging exec $POD -c elasticsearch -- es_util --query=_cat/allocation?v\&pretty=true

Watch audit logs:

$ oc adm node-logs --role=master --path=openshift-apiserver/

Watch audit.log from certain node:

$ oc adm node-logs ocp-abcdf-master-0 --path=openshift-apiserver/audit-2023-09-26T14-11-04.448.log

Search string:

$ oc adm node-logs ocp-abcdf-master-0 --path=openshift-apiserver/audit-2023-09-26T14-11-04.448.log | jq 'select(.verb == "delete")'
$ oc adm node-logs ocp-46578-master-1 --path=openshift-apiserver/audit.log | jq 'select(.verb == "delete" and .objectRef.resource != "routes" and .objectRef.resource != "templateinstances" and .objectRef.resource != "rolebindings" )' 

Source:
https://docs.openshift.com/container-platform/4.12/security/audit-log-view.html

Information gathering

https://docs.redhat.com/en/documentation/openshift_container_platform/4.21/html/support/gathering-cluster-data#support_gathering_data_gathering-cluster-data

Must-gather

$ oc adm must-gather

-> creates a directory must-gather.local.XXXXXX

https://docs.openshift.com/container-platform/4.12/cli_reference/openshift_cli/administrator-cli-commands.html#oc-adm-inspect (delete secrets before sharing, if necessary!)

SOS Report

https://access.redhat.com/solutions/4387261

Inspect

Get information per resource and for a certain period:

$ oc adm inspect clusteroperator/kube-apiserver --dest-dir /tmp/kube-apiserver --since 1m

Special cases

Namespace not deletable

Namespace gets stuck in status terminating

Watch out for secrets that are left over and not deletable. Set the finalizers to an empty list:

$ oc patch secrets $SECRET -n ocp-cluster-iam-entw  -p '{"metadata":{"finalizers":[]}}' --type=merge

Run containers as root

This should only be done as a last resort or for temporary tests, as attackers could theoretically break out of the container and become root on the node.

In the deployment, add the following lines under the pod template's "spec" (containers is a list, so the entry needs a name):

spec:
  containers:
  - name: <container-name>
    securityContext:
      runAsUser: 0

You must also grant the anyuid SCC to the service account under which the deployment runs. If nothing else is configured, this is normally the default service account:

# oc project <myproject>
# oc adm policy add-scc-to-user anyuid -z default

App URLs

Kibana

https://kibana-openshift-logging.apps.ocp.example.com/

ArgoCD

https://openshift-gitops-server-openshift-gitops.apps.ocp.example.com

Useful terms

IPI: Installer-provisioned infrastructure cluster
Cluster installed by the install command; the user only has to provide some information (platform, cluster name, network, storage, ...)

UPI: User-provisioned infrastructure cluster

  • DNS and load balancing must already exist
  • Manual installation; download of an OVA file (in case of vSphere)
  • Masters are created manually
  • Workers recommended
  • *no* keepalived

Advantages:
IPI: simpler installation, uses preconfigured features
UPI: more flexibility, no load balancer outage during updates

A change from IPI to UPI is not possible.


You can see more short names by running:

$ oc api-resources
cm   configmap
csv  clusterserviceversion
dc   deploymentconfig
ds   daemonset
ip   installplan
mcp  machineconfigpool
pv   persistentvolume
sa   serviceaccount
scc  securitycontextconstraints
svc  service