Runner throws an error and stops responding after a while

Hello.

We are running the runner on a single AWS EC2 instance. It works perfectly for a few hours, but after a while, when it is not in use, it throws an error and becomes unresponsive. I don't see any memory or CPU issue on the machine. Once I restart the container, everything goes back to normal.

ERROR:

An error occurred whilst listening to RUNNER_UPDATED websocket events.
java.io.EOFException: null
    at okio.RealBufferedSource.require(RealBufferedSource.java:65)
    at okio.RealBufferedSource.readByte(RealBufferedSource.java:78)
    at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117)
    at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
    at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:273)
    at okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:209)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:174)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Exception in thread "main" java.lang.RuntimeException: java.io.EOFException
    at io.reactivex.internal.util.ExceptionHelper.wrapOrThrow(ExceptionHelper.java:46)
    at io.reactivex.internal.observers.BlockingMultiObserver.blockingGet(BlockingMultiObserver.java:93)
    at io.reactivex.Completable.blockingAwait(Completable.java:1227)
    at com.atlassian.pipelines.runner.core.ApplicationImpl.main(ApplicationImpl.java:47)
Caused by: java.io.EOFException
    at okio.RealBufferedSource.require(RealBufferedSource.java:65)
    at okio.RealBufferedSource.readByte(RealBufferedSource.java:78)
    at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117)
    at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
    at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:273)
    at okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:209)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:174)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
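
Since restarting the container is the only thing reported to bring the runner back, a stopgap watchdog could look roughly like the sketch below. It is only an illustration under assumptions: the runner is taken to run as a Docker container whose name you pass in (the name is hypothetical), the two marker strings are copied from the trace above, and the function names are made up for this example. It is not an official fix.

```python
import subprocess

# Marker lines copied from the trace above; everything else is an assumption.
LISTENER_ERROR = "An error occurred whilst listening to RUNNER_UPDATED websocket events."
MAIN_THREAD_CRASH = 'Exception in thread "main" java.lang.RuntimeException: java.io.EOFException'


def has_fatal_websocket_error(log_text: str) -> bool:
    """Return True if the log holds both the listener error and the main-thread crash."""
    return LISTENER_ERROR in log_text and MAIN_THREAD_CRASH in log_text


def check_and_restart(container: str, since: str = "5m", run=subprocess.run) -> bool:
    """Restart `container` if its recent logs show the crash. Returns True if restarted.

    `run` is injectable so the logic can be exercised without Docker installed.
    """
    logs = run(
        ["docker", "logs", "--since", since, container],
        capture_output=True,
        text=True,
        check=False,
    )
    # Combine both streams, since the trace may be written to stderr.
    if has_fatal_websocket_error(logs.stdout + logs.stderr):
        run(["docker", "restart", container], check=True)
        return True
    return False
```

Calling `check_and_restart("my-runner")` from a scheduled job every few minutes would be one way to use it. Keep the `since` window no longer than the polling interval, otherwise the same old trace can trigger a second restart.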

 

5 comments

Paulo Antoniassi April 19, 2021

Same here, with exactly the same trace. It looks like it's random.

I'm running the latest image version, updated on Friday, 2021-04-16.

Jorge Cotelo April 19, 2021

I also updated to the latest image version on Friday and by Saturday afternoon the runner was back offline with the same error :(.

Update: I see there is a new image update this morning, so I am now running version `1.108`.

Atlassian Team
April 19, 2021

Hi,

When it goes offline, can you post the full set of runner logs from before and after the error, rather than just the error itself? I would like to see whether it is still reporting the health check or whether there are any errors reporting health.

Alternatively, when it has not been responding for more than 5 minutes, can you navigate to the repository admin runner page and tell me the status of the runner, e.g. is it ONLINE or OFFLINE?

Kind Regards,

Nathan Burrell 

Paulo Antoniassi April 20, 2021

Hi,

Before the error, the log shows just hours of health checks. Nothing is logged after it until the container is restarted.

The admin page shows that server as "OFFLINE" a few minutes after the error.

[2021-04-18 14:25:09,594] Updating runner state to "ONLINE".
[2021-04-18 14:25:39,595] Updating runner state to "ONLINE".
[2021-04-18 14:26:09,595] Updating runner state to "ONLINE".
[2021-04-18 14:26:39,594] Updating runner state to "ONLINE".
[2021-04-18 14:27:09,594] Updating runner state to "ONLINE".
[2021-04-18 14:27:34,794] An error occurred whilst listening to RUNNER_UPDATED websocket events.
java.io.EOFException: null
    at okio.RealBufferedSource.require(RealBufferedSource.java:65)
    at okio.RealBufferedSource.readByte(RealBufferedSource.java:78)
    at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117)
    at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
    at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:273)
    at okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:209)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:174)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Exception in thread "main" java.lang.RuntimeException: java.io.EOFException
    at io.reactivex.internal.util.ExceptionHelper.wrapOrThrow(ExceptionHelper.java:46)
    at io.reactivex.internal.observers.BlockingMultiObserver.blockingGet(BlockingMultiObserver.java:93)
    at io.reactivex.Completable.blockingAwait(Completable.java:1227)
    at com.atlassian.pipelines.runner.core.ApplicationImpl.main(ApplicationImpl.java:47)
Caused by: java.io.EOFException
    at okio.RealBufferedSource.require(RealBufferedSource.java:65)
    at okio.RealBufferedSource.readByte(RealBufferedSource.java:78)
    at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117)
    at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
    at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:273)
    at okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:209)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:174)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
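
For anyone comparing their own logs, the timing in the excerpt above can be checked with a short script. This is only a sketch: the log lines are copied from the excerpt, while the function and variable names are made up for illustration. It shows the health checks arriving roughly every 30 seconds and the websocket error landing about 25 seconds after the last one.

```python
from datetime import datetime

# Timestamped lines copied from the log excerpt above.
LOG = """\
[2021-04-18 14:25:09,594] Updating runner state to "ONLINE".
[2021-04-18 14:25:39,595] Updating runner state to "ONLINE".
[2021-04-18 14:26:09,595] Updating runner state to "ONLINE".
[2021-04-18 14:26:39,594] Updating runner state to "ONLINE".
[2021-04-18 14:27:09,594] Updating runner state to "ONLINE".
[2021-04-18 14:27:34,794] An error occurred whilst listening to RUNNER_UPDATED websocket events.
"""


def parse_timestamp(line: str) -> datetime:
    """Parse the leading "[YYYY-MM-DD HH:MM:SS,mmm]" prefix of a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")


lines = LOG.splitlines()
heartbeats = [parse_timestamp(l) for l in lines if "Updating runner state" in l]
error_time = next(parse_timestamp(l) for l in lines if "An error occurred" in l)

# Seconds between consecutive health checks.
intervals = [(b - a).total_seconds() for a, b in zip(heartbeats, heartbeats[1:])]
# Seconds between the last health check and the websocket error.
gap_before_error = (error_time - heartbeats[-1]).total_seconds()

print("health check intervals (s):", intervals)
print("last health check to error (s):", gap_before_error)
```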
Jorge Cotelo April 26, 2021

Hi,

Just to confirm, I am no longer getting the error with the new image. Now I am just having the same issue as @Paulo Antoniassi, where the step hangs.

This issue can be marked as resolved.
