Symptoms


All orders fail or are rescheduled with the following error:


Rating Engine call error: HTTP operation failed with error code 28: Operation timed out after 30001 milliseconds with 0 out of -1 bytes received


Cause

If all orders are affected, the Rating Engine most likely cannot communicate with the other components of the platform.


Resolution

Check if all pods are running properly:

[root@osscore ~]# kubectl get pod
NAME                                         READY   STATUS             RESTARTS   AGE
a8n-operator-d466588c-d55hm                  1/1     Running            2          170d
camunda-rest-67877fcb74-r5gzg                0/1     CrashLoopBackOff   232        171d
connect-cbc-adapter-7dd85dcb57-lpcxg         1/1     Running            131        2y138d
inhouse-products-6b9cf9f745-gcflq            0/1     Running            252        171d
order-management-b87856777-tm6bg             0/1     Running            26         52d
ratingengine-backend-7f4b567746-k66hg        0/1     Running            23         52d
ratingenginepayg-8dbb6949b-dtwzd             0/1     Running            20         52d
ux1-marketplace-connector-7c758f9559-xll5r   0/1     Running            317        543d
ux1-marketplace-elastic-548bc45f6f-prs59     1/1     Running            28         543d
ux1-ui-6c776d47c7-ff96x                      1/1     Running            76         543d
[root@osscore ~]#

A value of "0/1" in the READY column means that none of the pod's containers is ready, even when STATUS shows "Running".
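On a long pod list it can help to show only the pods that are not ready. The following is a minimal sketch, not part of the product tooling: the function name not_ready_pods is hypothetical, and it assumes the default column layout of "kubectl get pod" shown above (NAME, READY, STATUS, ...).

```shell
# List pods whose READY column shows fewer ready containers than expected.
# Reads `kubectl get pod` output on stdin; prints NAME, READY and STATUS.
# not_ready_pods is a hypothetical helper name used for illustration only.
not_ready_pods() {
    awk 'NR > 1 {
        split($2, r, "/")                    # READY looks like "0/1"
        if (r[1] != r[2]) print $1, $2, $3   # keep rows where ready != total
    }'
}

# Usage (on a host with cluster access):
#   kubectl get pod | not_ready_pods
#   kubectl get pod -n kube-system | not_ready_pods
```

The header line is skipped, and a pod is reported whenever the two numbers in READY differ.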


Check the system pods in the kube-system namespace:

[root@osscore ~]# kubectl get pod -n kube-system
NAME                                          READY   STATUS             RESTARTS   AGE
coredns-576cbf47c7-h8hhm                      0/1     CrashLoopBackOff   294        270d
coredns-576cbf47c7-t6gpc                      0/1     CrashLoopBackOff   293        270d
etcd-link8s.domain.local                      1/1     Running            99968      4y66d
kube-apiserver-link8s.domain.local            1/1     Running            78773      4y66d
kube-controller-manager-link8s.domain.local   1/1     Running            498        4y66d
kube-proxy-rh924                              1/1     Running            2          270d
kube-router-lsn8v                             0/1     ImagePullBackOff   43         270d
kube-scheduler-link8s.domain.local            1/1     Running            494        4y66d
tiller-deploy-6f6fd74b68-np96p                1/1     Running            112        270d
[root@osscore ~]#

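In the example above, both coredns pods are in CrashLoopBackOff and the kube-router pod is in ImagePullBackOff. These pods provide cluster DNS and pod networking, so while they are down the application pods cannot reach each other, which is consistent with the timeout in the Rating Engine call.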
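Before escalating, it may be useful to collect basic details about the failing system pods. The sketch below is an assumption about what is helpful to gather, not an official procedure: diagnose_pod and the KUBECTL override variable are hypothetical names, while "kubectl describe pod" and "kubectl logs --previous" are standard kubectl commands.

```shell
# Collect basic diagnostics for one pod.
# diagnose_pod and KUBECTL are hypothetical names for this sketch.
# Setting KUBECTL=echo previews the commands without running them.
diagnose_pod() {
    ns="$1"
    pod="$2"
    k="${KUBECTL:-kubectl}"
    # The Events section at the end of the output usually names the failure
    "$k" describe pod "$pod" -n "$ns"
    # Logs of the previous (crashed) container instance; this can fail when
    # the container never started (for example ImagePullBackOff), so the
    # error is ignored
    "$k" logs "$pod" -n "$ns" --previous --tail=50 || true
}

# Usage (on a host with cluster access):
#   diagnose_pod kube-system coredns-576cbf47c7-h8hhm
#   diagnose_pod kube-system kube-router-lsn8v
```

The output of both commands can be attached when reporting the issue.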


Report the issue to your K8s cluster administrator.


The following article could be useful to them:

https://cloudblue.freshdesk.com/support/solutions/articles/44002433308-microservices-not-in-ready-state-due-to-kube-router-in-errimagepullerr