We're facing an issue while running our pipeline built with the atlassian/ssh-run pipe. This has been working smoothly up until just recently, here is the stack trace we're seeing in the bitbucket-pipelines-runner docker log:
[2022-10-04 18:31:00,606] An error occurred whilst inspecting container.
com.github.dockerjava.api.exception.NotFoundException: Status 404: {"message":"No such container: 6886979a-4d81-5af5-8793-4d69d1fe32ea_d409412b-a384-4d15-a566-f3e33493d5ac_system_docker"}
at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:97)
at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:32)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:327)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:299)
at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
at io.netty.channel.epoll.EpollDomainSocketChannel$EpollDomainUnsafe.epollInReady(EpollDomainSocketChannel.java:138)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Unknown Source)
We have recently implemented Server Protection with Sophos so have a strong suspicion this is related to the new issue. What is strange is that running the uninstall script provided by Sophos doesn't fix the issue.
Is this familiar?
Hi @aslane,
Please make sure to sanitize private/sensitive info, if any, prior to sharing it.
Kind regards,
Theodora
Interestingly we've discovered the problem is related to the Sophos installation interfering with AWS ECS and the process of deploying new tasks.
Upon the Sophos installation, the amazon-ecs-volume-plugin.service is fine but as soon as an ECS task is launched, it is made inactive. I'm going to inform both AWS and Sophos of this issue.
Running a simple command to restart the service (systemctl restart amazon-ecs-volume-plugin) fixes the issue, and from what I can tell, the issue doesn't come about again, curiously. This service being restarted also fixed the pipeline issue we were seeing.
I just wanted to let you know of the situation, in case other users encounter this.
In any case, this ticket can be closed.
Thank you for your time!
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi @aslane,
It's good to hear that you figured this out, and thank you for sharing more details about what was happening and the resolution.
Please feel free to reach out if you ever need anything else!
Kind regards,
Theodora
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.