Symptoms
All orders fail or get rescheduled with the below error:
Rating Engine call error: HTTP operation failed with error code 28: Operation timed out after 30001 milliseconds with 0 out of -1 bytes received
Cause
If all orders are affected, it probably means that communication between Rating Engine and other parts of the platform is not working.
Resolution
Check if all pods are running properly:
[root@osscore ~]# kubectl get pod NAME READY STATUS RESTARTS AGE a8n-operator-d466588c-d55hm 1/1 Running 2 170d camunda-rest-67877fcb74-r5gzg 0/1 CrashLoopBackOff 232 171d connect-cbc-adapter-7dd85dcb57-lpcxg 1/1 Running 131 2y138d inhouse-products-6b9cf9f745-gcflq 0/1 Running 252 171d order-management-b87856777-tm6bg 0/1 Running 26 52d ratingengine-backend-7f4b567746-k66hg 0/1 Running 23 52d ratingenginepayg-8dbb6949b-dtwzd 0/1 Running 20 52d ux1-marketplace-connector-7c758f9559-xll5r 0/1 Running 317 543d ux1-marketplace-elastic-548bc45f6f-prs59 1/1 Running 28 543d ux1-ui-6c776d47c7-ff96x 1/1 Running 76 543d [root@osscore ~]#
Status "0/1" above means pods are not ready.
Check system pods
[root@osscore ~]# kubectl get pod -n kube-system NAME READY STATUS RESTARTS AGE coredns-576cbf47c7-h8hhm 0/1 CrashLoopBackOff 294 270d coredns-576cbf47c7-t6gpc 0/1 CrashLoopBackOff 293 270d etcd-link8s.domain.local 1/1 Running 99968 4y66d kube-apiserver-link8s.domain.local 1/1 Running 78773 4y66d kube-controller-manager-link8s.domain.local 1/1 Running 498 4y66d kube-proxy-rh924 1/1 Running 2 270d kube-router-lsn8v 0/1 ImagePullBackOff 43 270d kube-scheduler-link8s.domain.local 1/1 Running 494 4y66d tiller-deploy-6f6fd74b68-np96p 1/1 Running 112 270d [root@osscore ~]#
As you can see in the above example, coredns and kube-router pods are not well.
Report the issue to your K8s cluster administrator.
The following article could be useful to them: