We know that the carry **Ci+1** depends on the previous carry Ci through the following relation:

C_{i+1} = A_{i} B_{i} + A_{i} C_{i} + B_{i} C_{i}

which can be written as C_{i+1} = G_{i} + P_{i} C_{i}, where G_{i} = A_{i} B_{i} and P_{i} = A_{i} + B_{i}.
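As a quick sanity check of this rewriting, the sketch below compares both forms of the carry over all eight input combinations. It assumes G = A AND B and P = A OR B, matching the verbal definitions in the text; the function names are illustrative only.

```python
from itertools import product


def generate(a: int, b: int) -> int:
    """Carry generate: 1 only when both input bits are 1."""
    return a & b


def propagate(a: int, b: int) -> int:
    """Carry propagate (inclusive-OR form): 1 when at least one input bit is 1."""
    return a | b


def carry_three_term(a: int, b: int, c: int) -> int:
    """Next carry as A*B + A*C + B*C."""
    return (a & b) | (a & c) | (b & c)


def carry_gp(a: int, b: int, c: int) -> int:
    """Next carry as G + P*C."""
    return generate(a, b) | (propagate(a, b) & c)


# Collect every input combination on which the two forms disagree.
mismatches = [
    (a, b, c)
    for a, b, c in product((0, 1), repeat=3)
    if carry_three_term(a, b, c) != carry_gp(a, b, c)
]
print(mismatches)  # → []
```

An empty list means the two expressions agree on every input, so the G/P form can stand in for the three-term form.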

G_{i} is called the carry generate function because it generates a carry when A_{i} = 1 and B_{i} = 1, and P_{i} is called the carry propagate function because it propagates the incoming carry when A_{i} = 1 or B_{i} = 1. Using G_{i} and P_{i}, we get the following equations:

C2 = G1 + P1*C1

C3 = G2 + P2*C2 = G2 + P2*(G1 + P1*C1) = G2 + G1*P2 + P1*P2*C1

C4 = G3 + P3*C3 = G3 + P3*(G2 + G1*P2 + P1*P2*C1) = G3 + G2*P3 + G1*P2*P3 + C1*P1*P2*P3

C5 = G4 + P4*C4 = G4 + P4*(G3 + G2*P3 + G1*P2*P3 + C1*P1*P2*P3)

= G4 + G3*P4 + G2*P3*P4 + G1*P2*P3*P4 + C1*P1*P2*P3*P4

These equations show that C2, C3, C4 and C5 can each be calculated directly from C1, without waiting for the intermediate carries. Hence the circuit is called a carry look-ahead adder. This is a 4-stage circuit.
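The expanded equations can be checked against the step-by-step recurrence with a short sketch. The function names and the list layout (index 0 holds G1 or P1) are assumptions made for illustration, not part of the original circuit description.

```python
from itertools import product


def lookahead_carries(g, p, c1):
    """Compute (C2, C3, C4, C5) directly from C1 using the expanded equations.

    g and p hold [G1, G2, G3, G4] and [P1, P2, P3, P4].
    """
    g1, g2, g3, g4 = g
    p1, p2, p3, p4 = p
    c2 = g1 | (p1 & c1)
    c3 = g2 | (g1 & p2) | (p1 & p2 & c1)
    c4 = g3 | (g2 & p3) | (g1 & p2 & p3) | (c1 & p1 & p2 & p3)
    c5 = (g4 | (g3 & p4) | (g2 & p3 & p4) | (g1 & p2 & p3 & p4)
          | (c1 & p1 & p2 & p3 & p4))
    return c2, c3, c4, c5


def ripple_carries(g, p, c1):
    """Compute the same carries one stage at a time: C(i+1) = Gi + Pi*Ci."""
    carries = []
    c = c1
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
        carries.append(c)
    return tuple(carries)


# Exhaustive comparison over every G, P and C1 combination (2**9 cases).
for bits in product((0, 1), repeat=9):
    g, p, c1 = bits[:4], bits[4:8], bits[8]
    assert lookahead_carries(g, p, c1) == ripple_carries(g, p, c1)

# Example: A = 11 and B = 5 (bits listed LSB first), with C1 = 0.
a_bits = [1, 1, 0, 1]
b_bits = [1, 0, 1, 0]
g = [a & b for a, b in zip(a_bits, b_bits)]
p = [a | b for a, b in zip(a_bits, b_bits)]
print(lookahead_carries(g, p, 0))  # → (1, 1, 1, 1)
```

In hardware each look-ahead expression is a single two-level AND-OR network, so all four carries settle at the same time; the Python version only confirms that the algebra is consistent.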

The circuit has AND gates at level 1 and OR gates at level 2. The largest fan-in of the level-2 OR gate and of the level-1 AND gates is 5 (both occur in the expression for C5), and the maximum practical fan-in is about 8. So we can't extend this circuit to many more look-ahead stages, but we can use the 4-stage circuit in cascaded form. In the following diagram we have cascaded two 4-stage circuits to make an 8-bit adder.
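The cascading idea can also be sketched in software. The example below is a minimal model, not the circuit itself: it assumes each sum bit is formed as A XOR B XOR (incoming carry), which the text does not spell out, and the names `cla4`, `add8`, `to_bits` and `from_bits` are hypothetical helpers.

```python
def cla4(a_bits, b_bits, c_in):
    """One 4-stage look-ahead block. Bits are LSB first. Returns (sum_bits, carry_out)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]  # G1..G4
    p = [a | b for a, b in zip(a_bits, b_bits)]  # P1..P4
    c1 = c_in
    c2 = g[0] | (p[0] & c1)
    c3 = g[1] | (g[0] & p[1]) | (p[0] & p[1] & c1)
    c4 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c1 & p[0] & p[1] & p[2])
    c5 = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
          | (g[0] & p[1] & p[2] & p[3]) | (c1 & p[0] & p[1] & p[2] & p[3]))
    # Assumed sum logic: each sum bit is A XOR B XOR the carry into that stage.
    sums = [a ^ b ^ c for a, b, c in zip(a_bits, b_bits, [c1, c2, c3, c4])]
    return sums, c5


def to_bits(x, n):
    """Split an integer into n bits, LSB first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Rebuild an integer from LSB-first bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def add8(x, y, c_in=0):
    """8-bit adder built from two cascaded 4-stage blocks."""
    a, b = to_bits(x, 8), to_bits(y, 8)
    low, c_mid = cla4(a[:4], b[:4], c_in)    # lower four bits
    high, c_out = cla4(a[4:], b[4:], c_mid)  # upper block takes the lower block's carry-out
    return from_bits(low + high), c_out


print(add8(200, 100))  # → (44, 1), i.e. 300 = 256 + 44
```

Only the carry-out of the lower block is passed to the upper block, so the fan-in of every gate stays at 5 regardless of how many blocks are chained.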

So if all Gi, Pi and C1 are available, we can calculate all the carries in a gate delay of only 2Δ, and we can obtain all Gi and Pi from the inputs in 1 gate delay (1Δ). Hence the carries of a 4-stage circuit are available after 3 gate delays (3Δ), and obtaining the sum bits takes a further 3 gate delays (3Δ). Hence we need a total of 6 gate delays (6Δ) for a 4-stage CLA circuit.

For a 16-bit adder the total delay is Δ + 2Δ + 2Δ + 2Δ + 2Δ + 3Δ = 12Δ (1Δ for all Gi and Pi, 2Δ for each of the four cascaded 4-stage carry circuits, and 3Δ for the sum), which is also illustrated below:

So we have reduced the delay of a 16-bit adder from 33Δ to 12Δ, a reduction by a factor of about 3 (33/12 ≈ 2.75).
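The delay accounting above can be written as a small function. This is a sketch under the delay model used in the text (1Δ for all Gi and Pi, 2Δ per cascaded 4-stage carry circuit, 3Δ for the sum); the function name `cla_delay` is illustrative, and the 33Δ figure is simply the value quoted above rather than something derived here.

```python
def cla_delay(n_bits: int, block_size: int = 4) -> int:
    """Total delay, in units of the gate delay, of a cascaded CLA adder."""
    if n_bits <= 0 or n_bits % block_size != 0:
        raise ValueError("n_bits must be a positive multiple of block_size")
    blocks = n_bits // block_size
    gp_delay = 1               # all Gi and Pi are formed in parallel
    carry_delay = 2 * blocks   # each cascaded block adds one AND-OR pair
    sum_delay = 3              # final sum formation
    return gp_delay + carry_delay + sum_delay


print(cla_delay(4))        # → 6
print(cla_delay(16))       # → 12
print(33 / cla_delay(16))  # → 2.75
```

Under this model the delay grows by 2Δ for every additional 4-stage block.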