We have some pretty memory-intensive Bamboo builds. Even with remote agents that have 24GB of RAM, we still manage to chew through it. On users' machines, this just results in slower builds, but in Bamboo it causes the agent to die. What are my options for keeping the agent alive? Is allocating more memory in the wrapper.conf file reasonable (currently, wrapper.java.initmemory and wrapper.java.maxmemory are both set to 1024)? Is there something else I can set to make the agent service more tolerant?
FYI, the logs from the agent look like this. I can't make sense of them, but perhaps they will give someone else a clue so they can point me in the right direction:
2012-05-30 15:06:47,293 INFO [QuartzScheduler_QuartzSchedulerThread] [SystemInfo] Can't get free disk space. A warning returned.
java.io.IOException: Cannot run program "df": java.io.IOException: error=12, Cannot allocate memory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at java.lang.Runtime.exec(Runtime.java:593)
    at java.lang.Runtime.exec(Runtime.java:466)
    at org.apache.commons.io.FileSystemUtils.openProcess(FileSystemUtils.java:454)
    at org.apache.commons.io.FileSystemUtils.performCommand(FileSystemUtils.java:404)
    at org.apache.commons.io.FileSystemUtils.freeSpaceUnix(FileSystemUtils.java:323)
    at org.apache.commons.io.FileSystemUtils.freeSpaceOS(FileSystemUtils.java:196)
    at org.apache.commons.io.FileSystemUtils.freeSpaceKb(FileSystemUtils.java:166)
    at com.atlassian.bamboo.configuration.SystemInfo.<init>(SystemInfo.java:140)
    at com.atlassian.bamboo.configuration.SystemInfoFactory.getSystemInfo(SystemInfoFactory.java:35)
FYI, this is on Bamboo v3.2. Perhaps newer versions of Bamboo are more resilient?
The log you've shown means that the system did not have enough memory to spawn a subprocess. If it's the only error you're getting, the problem is likely not with the agent itself but with other processes allocating too much memory.
Try monitoring the memory consumption on your agent host. Looping "ps v; sleep 5" should be enough to find what's taking up memory.
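If the raw "ps v" output is hard to scan, here is a rough Python sketch of the same polling idea that prints only the biggest consumers. It assumes a Unix-like host whose ps accepts "-eo pid,rss,comm"; the function names (parse_ps, top_consumers, watch) are made up for this example.

```python
import subprocess
import time


def parse_ps(output):
    """Parse `ps -eo pid,rss,comm` output into (pid, rss_kb, command) tuples."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((int(parts[0]), int(parts[1]), parts[2]))
    return rows


def top_consumers(output, n=5):
    """Return the n processes with the largest resident set size (RSS)."""
    return sorted(parse_ps(output), key=lambda row: row[1], reverse=True)[:n]


def watch(interval=5, n=5):
    """Print the top memory consumers every `interval` seconds until interrupted."""
    while True:
        out = subprocess.run(
            ["ps", "-eo", "pid,rss,comm"],
            capture_output=True, text=True, check=True,
        ).stdout
        stamp = time.strftime("%H:%M:%S")
        for pid, rss_kb, comm in top_consumers(out, n):
            print(f"{stamp} pid={pid} rss={rss_kb // 1024}MB {comm}")
        time.sleep(interval)
```

Calling watch() on the agent host while a build runs and redirecting the output to a file should show which process grows just before the agent dies.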
We did find out that someone created a test that consumes large amounts of memory (way more than he expected). From a systems point of view, I get it -- systems that run out of memory do bad things. From the users' point of view, they don't think the entire agent should die because of a "bad" test. So I guess my question morphs into -- how do I prevent bad processes spun off from a Bamboo build from killing the agent?
Is there something we can set up in Bamboo to fail builds when the machine reaches a certain memory threshold? Or will we need to set up an external memory-watching script that fails the build when the system reaches a certain point?
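For what it's worth, an external memory-watching script along those lines could look roughly like the sketch below. It is not a Bamboo feature: it assumes a Linux agent host with a readable /proc/meminfo, and the names run_guarded and available_kb, as well as the threshold, are hypothetical. MemAvailable only exists on newer kernels, so the sketch falls back to MemFree.

```python
import os
import signal
import subprocess


def available_kb(meminfo_text):
    """Return MemAvailable (or MemFree as a fallback) in kB from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        value = rest.split()
        if value and value[0].isdigit():
            fields[key.strip()] = int(value[0])
    return fields.get("MemAvailable", fields.get("MemFree", 0))


def run_guarded(cmd, min_free_kb, poll_seconds=5.0, meminfo_path="/proc/meminfo"):
    """Run cmd and kill its process group if available memory drops too low.

    Returns the command's exit code, or None if the watchdog killed it.
    """
    # start_new_session gives the child its own process group, so any test
    # processes it spawns can be killed together with it.
    proc = subprocess.Popen(cmd, start_new_session=True)
    while True:
        try:
            return proc.wait(timeout=poll_seconds)
        except subprocess.TimeoutExpired:
            pass  # still running: check memory, then wait again
        with open(meminfo_path) as f:
            low = available_kb(f.read()) < min_free_kb
        if low:
            try:
                os.killpg(proc.pid, signal.SIGKILL)
            except ProcessLookupError:
                pass  # the group already exited on its own
            proc.wait()
            return None
```

The build step would then call something like run_guarded(["make", "test"], min_free_kb=2 * 1024 * 1024) instead of running the tests directly, and treat a None result as a failed build. The command and the 2GB threshold are placeholders.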
In the worst-case scenario, the OS will trigger the OOM killer and simply kill the agent. There's nothing you can do about this.
I don't think the error in the stack trace is what kills your agent; more likely the system itself kills it.
If your tests are in Java, you can limit the maximum heap consumption.
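For Java tests, the usual knob is the JVM's -Xmx option (for example -Xmx512m) on the command line that launches them. For tests that are not in Java, a comparable cap can be approximated at the OS level. The sketch below assumes a Linux host and uses an address-space limit (RLIMIT_AS); run_with_memory_cap and HUNGRY are made-up names. JVMs tend to reserve a lot of virtual address space up front, so this kind of cap is probably better suited to non-Java processes.

```python
import resource
import subprocess
import sys

# Hypothetical memory-hungry "test": tries to allocate 2 GiB in one go.
HUNGRY = [sys.executable, "-c", "x = bytearray(2 << 30)"]


def run_with_memory_cap(cmd, max_bytes):
    """Run cmd with its address space capped at max_bytes; return the exit code."""
    def apply_limit():
        # Runs in the child just before exec, so only the child is limited.
        # Allocations beyond the cap then fail inside the child process.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(cmd, preexec_fn=apply_limit).returncode
```

With a 1 GiB cap, run_with_memory_cap(HUNGRY, 1 << 30) should come back with a nonzero exit code, because the allocation fails inside the test process rather than exhausting the host and taking the agent down with it.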