Starting Monday night (3/17 into 3/18), again on Tuesday night, and persisting since the all clear of the BB issue reported today, all of our pipelines that upload a large (~300 MB) build artifact have failed in a manner similar to what was reported here. Namely, we get errors related to a bad PUT in the log file, followed by the same "Error occurred whilst processing an artifact" message from the linked issue.
As per the instructions in the linked ticket, I tried increasing the timeout, first to 120 and then to 300 seconds. I can see that the option took effect, since the Java invocation now contains "-Dbitbucket.pipelines.runner.s3.read.timeout.seconds=300". The same failure was observed with the larger timeouts.
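For anyone else checking the same thing: one way to confirm which runner system properties actually made it onto the JVM command line is to inspect the running Java process (e.g. the output of `ps` for the runner process). This is just an illustrative sketch; the `extract_runner_props` helper is hypothetical, not part of the runner.

```python
import re

def extract_runner_props(java_cmdline: str) -> dict:
    """Pull -Dkey=value system properties out of a JVM command line."""
    props = {}
    for match in re.finditer(r"-D([\w.]+)=(\S+)", java_cmdline):
        props[match.group(1)] = match.group(2)
    return props

# Example command-line fragment like the one reported above.
cmdline = ("java -Dbitbucket.pipelines.runner.s3.read.timeout.seconds=300 "
           "-jar runner.jar")
props = extract_runner_props(cmdline)
print(props["bitbucket.pipelines.runner.s3.read.timeout.seconds"])  # -> 300
```

On the runner host you would feed this the real command line, e.g. from `ps -o args= -C java`.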
Oddly, my personal fork of the same repo, on my personal runner (located in the same building but on a slightly different internal network), succeeds at the artifact upload; in that case the upload takes 5 seconds. A speedtest on the problematic runner shows reasonable upload/download speeds:
Idle Latency: 1.50 ms (jitter: 0.06ms, low: 1.45ms, high: 1.56ms)
Download: 937.67 Mbps (data used: 445.1 MB)
  Latency: 21.06 ms (jitter: 0.62ms, low: 1.50ms, high: 23.79ms)
Upload: 69.49 Mbps (data used: 215.8 MB)
  Latency: 1.68 ms (jitter: 0.16ms, low: 1.49ms, high: 7.58ms)
Packet Loss: 0.0%
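Worth noting that a speedtest doesn't exercise a single sustained ~300 MB PUT the way the artifact upload does, so the two can disagree. A rough way to compare the two networks is to time a large transfer in the same chunked, multipart-like style. The sketch below is hypothetical (the chunk size is illustrative, not the runner's real part size, and `send` is a placeholder you would point at any large-upload endpoint you control); only the local chunking path is exercised here.

```python
import io
import time

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB; illustrative, not the runner's real part size

def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield (part_number, bytes) pairs, mirroring a multipart-style upload."""
    part = 1
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield part, data
        part += 1

def time_chunks(stream, send):
    """Call send(part, data) per chunk and return per-chunk throughput in MB/s."""
    rates = []
    for part, data in iter_chunks(stream):
        start = time.monotonic()
        send(part, data)
        elapsed = max(time.monotonic() - start, 1e-9)
        rates.append((part, len(data) / elapsed / 1e6))
    return rates

# Local smoke test: 20 MiB of zeros splits into 3 parts (8 + 8 + 4 MiB).
parts = list(iter_chunks(io.BytesIO(b"\0" * (20 * 1024 * 1024))))
print(len(parts))  # -> 3
```

In real use, `send` would perform the HTTP PUT (for example with `urllib.request`) from both the problematic runner and the working one; a per-chunk rate that collapses partway through on only one network would point firmly at that network path.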
Here is what I believe is the relevant runner.log information:
[2025-03-18 10:13:25,279] {"traceId":"67d999e2b7087db286b5ac9ead22b19f","parentId":"86b5ac9ead22b19f","id":"a7d8ead32d715041","kind":"CLIENT","name":"PUT","timestamp":1742314404832196,"duration":446803,"localEndpoint":{"serviceName":"runner","ipv4":"..."},"tags":{"http.method":"PUT","http.path":"/ex/bitbucket-pipelines/rest/internal/accounts/{...}/repositories/{...}/runners/{...}/state"}}
[2025-03-18 10:13:26,292] {"traceId":"67d999e2b7087db286b5ac9ead22b19f","parentId":"1813dff512cd8587","id":"e7f32c1fbca42dbc","kind":"CLIENT","name":"PUT","timestamp":1742314364967942,"duration":41324820,"localEndpoint":{"serviceName":"runner","ipv4":"..."},"tags":{"http.method":"PUT","http.path":"/artifact/{...}/{...}/{...}/artifact_{...}.tar.gz","http.status_code":"400","error":"400"}}
[2025-03-18 10:13:26,295] {"traceId":"67d999e2b7087db286b5ac9ead22b19f","parentId":"1813dff512cd8587","id":"82dd27f15c81814e","kind":"CLIENT","name":"PUT","timestamp":1742314364983890,"duration":41311669,"localEndpoint":{"serviceName":"runner","ipv4":"..."},"tags":{"http.method":"PUT","http.path":"/artifact/{...}/{...}/{...}/artifact_{...}.tar.gz","error":"CANCELLED"}}
[2025-03-18 10:13:26,296] {"traceId":"67d999e2b7087db286b5ac9ead22b19f","parentId":"1813dff512cd8587","id":"60ceb2b8bd079033","kind":"CLIENT","name":"PUT","timestamp":1742314364990912,"duration":41305081,"localEndpoint":{"serviceName":"runner","ipv4":"..."},"tags":{"http.method":"PUT","http.path":"/artifact/{...}/{...}/{...}/artifact_{...}.tar.gz","error":"CANCELLED"}}
[2025-03-18 10:13:26,296] {"traceId":"67d999e2b7087db286b5ac9ead22b19f","parentId":"1813dff512cd8587","id":"6b6ec916ef1d1d21","kind":"CLIENT","name":"PUT","timestamp":1742314364981720,"duration":41314511,"localEndpoint":{"serviceName":"runner","ipv4":"..."},"tags":{"http.method":"PUT","http.path":"/artifact/{...}/{...}/{...}/artifact_{...}.tar.gz","error":"CANCELLED"}}
[2025-03-18 10:13:26,296] {"traceId":"67d999e2b7087db286b5ac9ead22b19f","parentId":"1813dff512cd8587","id":"5abb9246576f2156","kind":"CLIENT","name":"PUT","timestamp":1742314364991277,"duration":41305218,"localEndpoint":{"serviceName":"runner","ipv4":"..."},"tags":{"http.method":"PUT","http.path":"/artifact/{...}/{...}/{...}/artifact_{...}.tar.gz","error":"CANCELLED"}}
[2025-03-18 10:13:26,298] Error while uploading file to s3
org.springframework.web.reactive.function.client.WebClientResponseException$BadRequest: 400 Bad Request from PUT https://micros--prod-east--bitbucketci-file-service--files.s3.amazonaws.com/artifact/...
	at org.springframework.web.reactive.function.client.WebClientResponseException.create(WebClientResponseException.java:217)
	Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
	*__checkpoint ⇢ 400 from PUT https://micros--prod-east--bitbucketci-file-service--files.s3.amazonaws.com/artifact/...
Original Stack Trace:
		at org.springframework.web.reactive.function.client.WebClientResponseException.create(WebClientResponseException.java:217)
		at org.springframework.web.reactive.function.client.DefaultClientResponse.lambda$createException$1(DefaultClientResponse.java:207)
		at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:106)
		at reactor.core.publisher.FluxOnErrorReturn$ReturnSubscriber.onNext(FluxOnErrorReturn.java:162)
		at reactor.core.publisher.FluxDefaultIfEmpty$DefaultIfEmptySubscriber.onNext(FluxDefaultIfEmpty.java:101)
		at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
		at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onNext(FluxContextWrite.java:107)
		at reactor.core.publisher.FluxMapFuseable$MapFuseableConditionalSubscriber.onNext(FluxMapFuseable.java:299)
		at reactor.core.publisher.FluxFilterFuseable$FilterFuseableConditionalSubscriber.onNext(FluxFilterFuseable.java:337)
		at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1839)
		at reactor.core.publisher.MonoCollect$CollectSubscriber.onComplete(MonoCollect.java:160)
		at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:144)
		at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onComplete(ScopePassingSpanSubscriber.java:103)
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onComplete(FluxPeek.java:260)
		at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onComplete(ScopePassingSpanSubscriber.java:103)
		at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:144)
		at reactor.netty.channel.FluxReceive.onInboundComplete(FluxReceive.java:415)
		at reactor.netty.channel.ChannelOperations.onInboundComplete(ChannelOperations.java:439)
		at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:493)
		at reactor.netty.http.client.HttpClientOperations.onInboundNext(HttpClientOperations.java:796)
		at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:115)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
		at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:289)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
		at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
		at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
		at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
		at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
		at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1503)
		at io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1377)
		at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1417)
		at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
		at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
		at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
		at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
		at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868)
		at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:799)
		at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:501)
		at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:399)
		at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
		at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
		at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
		at java.base/java.lang.Thread.run(Thread.java:840)
Wrapped by: com.atlassian.pipelines.runner.core.exception.S3UploadException: Failed to upload chunk, part number 5
	at com.atlassian.pipelines.runner.core.util.file.upload.S3MultiPartUploaderImpl.lambda$uploadChunk$16(S3MultiPartUploaderImpl.java:167)
	at io.reactivex.internal.operators.single.SingleResumeNext$ResumeMainSingleObserver.onError(SingleResumeNext.java:73)
	at io.reactivex.internal.operators.flowable.FlowableSingleSingle$SingleElementSubscriber.onError(FlowableSingleSingle.java:97)
	at io.reactivex.subscribers.SerializedSubscriber.onError(SerializedSubscriber.java:142)
	at io.reactivex.internal.operators.flowable.FlowableRepeatWhen$WhenReceiver.onError(FlowableRepeatWhen.java:112)
	at io.reactivex.internal.operators.flowable.FlowableFlatMap$MergeSubscriber.checkTerminate(FlowableFlatMap.java:572)
	at io.reactivex.internal.operators.flowable.FlowableFlatMap$MergeSubscriber.drainLoop(FlowableFlatMap.java:379)
	at io.reactivex.internal.operators.flowable.FlowableFlatMap$MergeSubscriber.drain(FlowableFlatMap.java:371)
	at io.reactivex.internal.operators.flowable.FlowableFlatMap$MergeSubscriber.innerError(FlowableFlatMap.java:611)
	at io.reactivex.internal.operators.flowable.FlowableFlatMap$InnerSubscriber.onError(FlowableFlatMap.java:677)
	at io.reactivex.internal.subscriptions.EmptySubscription.error(EmptySubscription.java:55)
	at io.reactivex.internal.operators.flowable.FlowableError.subscribeActual(FlowableError.java:40)
	at io.reactivex.Flowable.subscribe(Flowable.java:14935)
	at io.reactivex.Flowable.subscribe(Flowable.java:14882)
	at io.reactivex.internal.operators.flowable.FlowableFlatMap$MergeSubscriber.onNext(FlowableFlatMap.java:163)
	at io.reactivex.internal.operators.flowable.FlowableDoOnEach$DoOnEachSubscriber.onNext(FlowableDoOnEach.java:92)
	at io.reactivex.internal.operators.flowable.FlowableFlatMap$MergeSubscriber.tryEmitScalar(FlowableFlatMap.java:234)
	at io.reactivex.internal.operators.flowable.FlowableFlatMap$MergeSubscriber.onNext(FlowableFlatMap.java:152)
	at io.reactivex.internal.operators.flowable.FlowableZip$ZipCoordinator.drain(FlowableZip.java:249)
	at io.reactivex.internal.operators.flowable.FlowableZip$ZipSubscriber.onNext(FlowableZip.java:381)
	at io.reactivex.processors.UnicastProcessor.drainFused(UnicastProcessor.java:362)
	at io.reactivex.processors.UnicastProcessor.drain(UnicastProcessor.java:395)
	at io.reactivex.processors.UnicastProcessor.onNext(UnicastProcessor.java:457)
	at io.reactivex.processors.SerializedProcessor.onNext(SerializedProcessor.java:103)
	at io.reactivex.internal.operators.flowable.FlowableRepeatWhen$WhenSourceSubscriber.again(FlowableRepeatWhen.java:171)
	at io.reactivex.internal.operators.flowable.FlowableRetryWhen$RetryWhenSubscriber.onError(FlowableRetryWhen.java:76)
	at io.reactivex.internal.operators.single.SingleToFlowable$SingleToFlowableObserver.onError(SingleToFlowable.java:67)
	at io.reactivex.internal.operators.single.SingleUsing$UsingSingleObserver.onError(SingleUsing.java:175)
	at io.reactivex.internal.operators.single.SingleMap$MapSingleObserver.onError(SingleMap.java:69)
	at io.reactivex.internal.operators.single.SingleMap$MapSingleObserver.onError(SingleMap.java:69)
	at io.reactivex.internal.operators.single.SingleObserveOn$ObserveOnSingleObserver.run(SingleObserveOn.java:79)
	at brave.propagation.CurrentTraceContext$1CurrentTraceContextRunnable.run(CurrentTraceContext.java:264)
	at com.atlassian.pipelines.common.trace.rxjava.CopyMdcSchedulerHandler$CopyMdcRunnableAdapter.run(CopyMdcSchedulerHandler.java:74)
	at io.reactivex.Scheduler$DisposeTask.run(Scheduler.java:608)
	at brave.propagation.CurrentTraceContext$1CurrentTraceContextRunnable.run(CurrentTraceContext.java:264)
	at com.atlassian.pipelines.common.trace.rxjava.CopyMdcSchedulerHandler$CopyMdcRunnableAdapter.run(CopyMdcSchedulerHandler.java:74)
	at io.reactivex.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:66)
	at io.reactivex.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:57)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
What debugging can/should we do to get to the bottom of this problem?
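One low-effort starting point is mining runner.log itself: the trace spans carry a duration field in microseconds, and the failing artifact PUTs all show durations around 41,300,000 µs (roughly 41 s), which would be worth correlating against any proxy or firewall idle/session timeouts on the problem network. A small, hypothetical parser for those lines (the helper name and output shape are mine, not part of the runner):

```python
import json
import re

# Matches "[timestamp] {json...}" lines like the runner.log excerpt above.
LINE_RE = re.compile(r"^\[([^\]]+)\]\s+(\{.*\})\s*$")

def parse_spans(log_lines):
    """Extract artifact PUT spans; duration is reported in microseconds."""
    spans = []
    for line in log_lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        try:
            span = json.loads(m.group(2))
        except json.JSONDecodeError:
            continue
        tags = span.get("tags", {})
        if tags.get("http.method") == "PUT" and "/artifact/" in tags.get("http.path", ""):
            spans.append({
                "id": span.get("id"),
                "seconds": span.get("duration", 0) / 1e6,
                "status": tags.get("http.status_code") or tags.get("error"),
            })
    return spans

# Simplified sample line in the same shape as the log excerpt.
sample = ('[2025-03-18 10:13:26,292] {"id":"e7f32c1fbca42dbc","duration":41324820,'
          '"tags":{"http.method":"PUT","http.path":"/artifact/x.tar.gz",'
          '"http.status_code":"400","error":"400"}}')
print(parse_spans([sample]))  # one span: ~41.3 s, status "400"
```

Run against the full runner.log from both the failing and the working runner, a tight cluster of durations just past some fixed threshold (as opposed to scattered values) tends to suggest a middlebox timeout rather than raw bandwidth.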
Hi John,
Can you please try to delete the runner and re-create it? There's a chance there's some form of cache that may be causing issues.
Also, please make sure the runner is updated to the latest version (if it's not already).
If that does not work, we'll need to raise a ticket on your behalf. Can you please let me know which timezone you are in?
Cheers!
- Ben (Bitbucket Cloud Support)
Thanks for the feedback. I'll try to re-create the runner.
I also reached out to my local IT folks, and they are investigating some internal network weirdness affecting other applications (large file pushes are also not working properly). At the moment I believe the root cause is likely an internal network problem, but until they resolve it (or tell me it is all working properly) I can't say for sure.