Symptoms



The platform becomes unavailable and it is not possible to log in to the control panel. In core.log, every task execution fails with the following exception:


Aug  6 05:43:28.022 : DBG []: c.p.p.s.p.e.PeriodicTaskCommandProcessor Task with id: 8742111 failed with exception: javax.ejb.EJBException: org.jboss.as.ee.component.ComponentIsStoppedException: WFLYEE0043: Component is stopped

The following error appears in console.log on the Management node:


200806 05:43:24,285 WARN [org.apache.activemq.artemis.core.server] (Thread-11091 (ActiveMQ-IO-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@36e2724f)) AMQ222010: Critical IO Error, shutting down the server. file=AIOSequentialFile:/usr/local/pem/wildfly-16.0.0.Final/standalone/data/activemq/journal/activemq-data-2855.amq, message=Timeout on close: java.io.IOException: Timeout on close
at org.apache.activemq.artemis.journal@2.6.3.jbossorg-00014//org.apache.activemq.artemis.core.io.aio.AIOSequentialFile.close(AIOSequentialFile.java:126)
at org.apache.activemq.artemis.journal@2.6.3.jbossorg-00014//org.apache.activemq.artemis.core.io.aio.AIOSequentialFile.close(AIOSequentialFile.java:103)

200806 05:43:24,285 WARN [org.apache.activemq.artemis.journal] (Thread-11091 (ActiveMQ-IO-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@36e2724f)) waiting pending callbacks on activemq-data-2855.amq from 10 seconds!
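
To check whether a given installation shows the same symptoms, the two error codes quoted above (WFLYEE0043 and AMQ222010) can be searched for in the logs. The following is only a minimal sketch: the function name find_io_shutdown_signs is hypothetical, and because this article does not state where core.log and console.log are located, the log paths are passed as arguments.

```shell
#!/bin/sh
# Hypothetical helper (not part of the platform): print every line in the
# given log files that contains one of the two error codes from this article.
# Log locations are an assumption left to the caller, so pass them as arguments.
find_io_shutdown_signs() {
    # -H prints the file name, -n the line number, -E enables the alternation
    grep -HnE 'AMQ222010|WFLYEE0043' "$@"
}
```

Example call: find_io_shutdown_signs /path/to/core.log /path/to/console.log. The function returns a non-zero exit status when neither code is found.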

Cause

 

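A critical input/output (I/O) error occurred on the Management node: as the console.log excerpt shows, ActiveMQ timed out while closing a journal file. This led to the shutdown of core services on the Management node. The platform performs this shutdown automatically to avoid data loss or corruption.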


Resolution


To restore the platform to service, restart the services on the Management node:


service pa-agent stop
service pau restart
service pa-agent start

Check the health of the disk storage at the hardware level to avoid such issues in the future.
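
As one possible starting point for that check, the kernel log can be scanned for I/O errors reported by the block layer. This is a sketch under assumptions rather than a prescribed procedure: the function name scan_kernel_io_errors is hypothetical, it reads log text from standard input, and it matches only the generic "I/O error" wording, so it does not replace vendor or controller diagnostics.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on standard input and print
# the lines that mention an I/O error (case-insensitive, literal match).
scan_kernel_io_errors() {
    grep -iF 'I/O error'
}

# Example usage (run as root on the Management node):
#   dmesg | scan_kernel_io_errors
# If smartmontools is installed, the drive's own health summary can also be
# queried; /dev/sda is a placeholder device name, not taken from this article:
#   smartctl -H /dev/sda
```

An empty result from the scan does not prove the storage is healthy, because the timeout seen in console.log may come from a slow or overloaded storage backend that logs no kernel errors.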