More Build Agents, More Concurrency

Starting with QuickBuild 4.0, each grid node (the server node or a build agent node) can run only one build at a time. To make full use of a powerful machine, it is therefore necessary to install multiple build agents on it. If the machine running the QuickBuild server has spare resources, you may also install build agents on that machine.

At any given time, a grid node can run steps of only a single build. A step is put into the waiting state if no eligible node can be found. A node is considered eligible for a requesting step if it satisfies the step's node selection setting and is not currently running any steps belonging to other builds.

If multiple nodes are eligible, QuickBuild will select the fastest node based on the benchmark test.

To illustrate, assume that:

The grid contains the server node, build agent1, and build agent2. Agent1 is faster than agent2.
The node selection setting of build A's master step matches any agent node. The master step contains step1, whose node selection setting matches the server node.
The node selection setting of build B's master step matches any agent node. The master step executes in parallel step1, whose node selection setting matches the server node, and step2, whose node selection setting matches any agent node.

Assume no builds are running on the grid when build A starts. Its master step selects agent1 because it is the faster agent, and its step1 runs on the server node. When build B starts, its master step runs on agent2 because agent1 is occupied by build A. Build B's step1 waits until step1 of build A finishes on the server node, while its step2 executes immediately on agent2 because the steps already running on agent2 also belong to build B.
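The selection rule and the scenario above can be replayed with a small Python sketch. This is an illustrative model only, not QuickBuild code: the `Node` class, the `select_node` function, and the numeric `benchmark` scores are hypothetical names and values chosen for the example, and "higher score means faster" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    name: str
    benchmark: float                      # hypothetical score: higher means faster
    is_agent: bool = True
    running_build: Optional[str] = None   # build currently occupying this node


def select_node(nodes: List[Node], build: str,
                matches: Callable[[Node], bool]) -> Optional[Node]:
    """Return the fastest eligible node for a step of `build`, or None if it must wait."""
    eligible = [
        n for n in nodes
        # Eligible: node selection setting matches, and the node is either
        # idle or already running steps of the same build.
        if matches(n) and n.running_build in (None, build)
    ]
    if not eligible:
        return None                       # the step goes into the waiting state
    chosen = max(eligible, key=lambda n: n.benchmark)
    chosen.running_build = build
    return chosen


def label(node: Optional[Node]) -> str:
    return node.name if node else "waiting"


server = Node("server", benchmark=1.0, is_agent=False)
agent1 = Node("agent1", benchmark=3.0)    # assumed faster than agent2
agent2 = Node("agent2", benchmark=2.0)
grid = [server, agent1, agent2]


def any_agent(n: Node) -> bool:
    return n.is_agent


def server_only(n: Node) -> bool:
    return n.name == "server"


# Build A starts first, then build B.
a_master = label(select_node(grid, "A", any_agent))
a_step1 = label(select_node(grid, "A", server_only))
b_master = label(select_node(grid, "B", any_agent))
b_step1 = label(select_node(grid, "B", server_only))
b_step2 = label(select_node(grid, "B", any_agent))

print(a_master, a_step1, b_master, b_step1, b_step2)
# prints: agent1 server agent2 waiting agent2
```

The sketch does not model steps finishing and releasing nodes; it only shows which node each step would get at the moment it asks for one.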

Possible Deadlocks

Concurrent builds can deadlock if steps are configured inappropriately. Consider the following case:

The grid contains the server node, build agent1, and build agent2.
The node selection setting of build A's master step matches agent1. The master step contains step1, whose node selection setting matches agent2.
The node selection setting of build B's master step matches agent2. The master step contains step1, whose node selection setting matches agent1.

When builds A and B start at the same time, agent1 is occupied by A's master step and agent2 is occupied by B's master step. A's step1 then tries to acquire agent2 while B's step1 tries to acquire agent1. This results in an endless waiting loop:

A's step1 will not run unless agent2 is available.
Agent2 will not be available unless B's master step finishes.
B's master step will not finish unless B's step1 finishes.
B's step1 will not run unless agent1 is available.
Agent1 will not be available unless A's master step finishes.
A's master step will not finish unless A's step1 finishes.
...

In this case, builds A and B will never finish unless they are manually cancelled or time out. The same deadlock can occur if agent1 is replaced by a group of agents and agent2 by another group of agents, that is, if A's master step matches agent group1 and A's step1 matches agent group2, while B's master step matches agent group2 and B's step1 matches agent group1. The root cause is that the availability of a group of agents recursively depends on its own availability. You should avoid this recursive dependency when setting up your builds.
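One way to check a set of configurations for this kind of recursive dependency is to treat each agent (or agent group) as a graph vertex, draw an edge from the group a master step occupies to every group one of its child steps needs, and look for a cycle. The Python sketch below does that with a depth-first search. It is a hypothetical helper, not a QuickBuild feature: the `find_cycle` function and the dictionary layout describing each build are assumptions made for the example, and a reported cycle only means a deadlock is possible, not that it will occur.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical description of a build: (group matched by the master step,
# groups matched by its child steps).
BuildConfig = Tuple[str, List[str]]


def find_cycle(builds: Dict[str, BuildConfig]) -> Optional[List[str]]:
    """Return a list of groups forming a wait cycle, or None if there is none."""
    graph: Dict[str, Set[str]] = {}
    for master, children in builds.values():
        waits_on = graph.setdefault(master, set())
        for child in children:
            graph.setdefault(child, set())
            # A child step may share the node already held by its own build,
            # so a step needing the master's group adds no dependency.
            if child != master:
                waits_on.add(child)

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, done
    state = {group: WHITE for group in graph}
    path: List[str] = []

    def visit(group: str) -> Optional[List[str]]:
        state[group] = GREY
        path.append(group)
        for nxt in sorted(graph[group]):
            if state[nxt] == GREY:        # back edge: nxt is on the current path
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[group] = BLACK
        return None

    for group in sorted(graph):
        if state[group] == WHITE:
            cycle = visit(group)
            if cycle:
                return cycle
    return None


# The configuration from the deadlock example above.
risky = {"A": ("agent1", ["agent2"]), "B": ("agent2", ["agent1"])}
# Both builds acquire the agents in the same order, so no cycle exists.
safe = {"A": ("agent1", ["agent2"]), "B": ("agent1", ["agent2"])}

print(find_cycle(risky))   # prints: ['agent1', 'agent2', 'agent1']
print(find_cycle(safe))    # prints: None
```

The `safe` configuration hints at a simple rule for avoiding the problem: if every build acquires agent groups in the same order, the dependency graph cannot contain a cycle.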
