Taking brightness measurements is not as complicated as you might think. The most accurate method uses a device called an integrating sphere. An integrating sphere is a sphere (duh) with a diffuse reflective coating on the inside that scatters the incoming light so that the illumination at any point on the interior is uniform. This minimizes the effect of directionality from optics or reflectors on the bulbs being measured. The sphere has two holes in it: one for the bulb being measured and one for the measurement. A baffle is also placed inside the sphere, between the light source and the measurement port, so that light cannot directly impinge on the port.
The measurement is made by a light meter inserted into the port. The system is calibrated by placing a calibrated light source (with a known light output) in the sphere and taking a reading with the light meter. The light meter outputs a lux measurement, and the reading taken from the known source gives a lux-to-lumens calibration factor for the system. I don't have a calibrated light source, but the good news is that comparing the bulbs against each other doesn't require conversion to lumens; we can stick with comparing the lux readings.
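For anyone who does have a calibrated source, the calibration step can be sketched in a few lines of Python. This is only an illustration: the function names and all of the numbers are made up for the example, and it assumes the meter reading scales linearly with the light output of the bulb in the sphere.

```python
def calibration_factor(reference_lumens, reference_lux):
    """Lumens per lux for this particular sphere and meter setup."""
    if reference_lux <= 0:
        raise ValueError("reference lux reading must be positive")
    return reference_lumens / reference_lux


def lux_to_lumens(lux_reading, factor):
    """Convert a lux reading taken in the same sphere to estimated lumens."""
    return lux_reading * factor


# Hypothetical numbers: a reference source rated at 800 lumens
# that reads 2000 lux on the meter when placed in the sphere.
factor = calibration_factor(reference_lumens=800.0, reference_lux=2000.0)

# A test bulb that reads 1500 lux in the same sphere.
estimated_lumens = lux_to_lumens(1500.0, factor)
print(f"factor = {factor:.3f} lm/lux, bulb = {estimated_lumens:.0f} lm (estimated)")
```

The factor only holds for the sphere, baffle, and meter position it was measured with, which is why the comparison in this study stays in lux.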
So the method for this study is simply to screw the bulb into a light bulb socket on the end of a power cord, plug the cord into a kill-a-watt or similar power meter, stick the bulb into the integrating sphere, and note the lux measured on the meter and the power on the kill-a-watt. The higher the lux reading, the more lumens output by the bulb. The factor we are interested in, though, is the light output per watt, a measure of the efficiency of the bulb. Since LEDs and CFLs change their light output with temperature, I also took an initial measurement, then waited 5 minutes and took another to compare how the bulbs performed once they got hot. In the case of LEDs, the effectiveness of the heat sink shows up in how much or how little the light output drops over time. A bulb that loses a lot of intensity with runtime has a poor thermal design, will not last long, and is not suited for use in an enclosed fixture.
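The two numbers pulled out of each bulb's readings can be sketched like this in Python. The function names and the sample readings are hypothetical, not measurements from this study, and the sketch uses lux per watt as a stand-in for efficiency since no lumens conversion is available.

```python
def lux_per_watt(lux, watts):
    """Relative efficiency: meter reading divided by measured power draw."""
    if watts <= 0:
        raise ValueError("power reading must be positive")
    return lux / watts


def warmup_drop_percent(initial_lux, warm_lux):
    """Percentage of light output lost between the first and the warm reading."""
    if initial_lux <= 0:
        raise ValueError("initial lux reading must be positive")
    return (initial_lux - warm_lux) / initial_lux * 100.0


# Hypothetical readings for one bulb (made-up numbers):
initial_lux = 2000.0   # reading right after switch-on
warm_lux = 1800.0      # reading after 5 minutes
watts = 10.0           # power shown on the power meter

efficiency = lux_per_watt(warm_lux, watts)
drop = warmup_drop_percent(initial_lux, warm_lux)
print(f"{efficiency:.1f} lux/W warm, {drop:.1f}% drop after warm-up")
```

A small drop suggests the heat sink is doing its job; a large one is the warning sign described above.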
One caveat is that this method does not correct for color. Lumens are scaled by the sensitivity of the human eye, so a red bulb would have to output more red photons to reach the same lumen output as a green bulb, since the human eye is more sensitive to green light. Since these are all nominally "white" bulbs of various color temperatures (mixes of wavelengths), we'll assume that the lux meter and the human eye are nominally equally sensitive to the light output by the bulbs.
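To put a rough number on the red versus green point, here is a small Python sketch for single-wavelength light. It uses the standard photopic peak of 683 lumens per watt at 555 nm and the CIE photopic sensitivity value of about 0.107 at 650 nm. The choice of those two wavelengths and the function names are mine, and real bulbs emit a mix of wavelengths, so treat this as an illustration only.

```python
# Peak luminous efficacy of radiation, at 555 nm (green).
PEAK_LM_PER_WATT = 683.0

# Relative eye sensitivity V(lambda) at two example wavelengths (CIE photopic).
EYE_SENSITIVITY = {
    555: 1.0,    # green, where the eye is most sensitive
    650: 0.107,  # red
}


def lumens_from_radiant_watts(radiant_watts, wavelength_nm):
    """Lumens produced by monochromatic light of the given radiant power."""
    return PEAK_LM_PER_WATT * EYE_SENSITIVITY[wavelength_nm] * radiant_watts


def radiant_watts_for_lumens(lumens, wavelength_nm):
    """Radiant power needed at one wavelength to reach a lumen target."""
    return lumens / (PEAK_LM_PER_WATT * EYE_SENSITIVITY[wavelength_nm])


target_lumens = lumens_from_radiant_watts(1.0, 555)   # 1 W of green light
red_watts = radiant_watts_for_lumens(target_lumens, 650)
print(f"Red needs about {red_watts:.1f}x the radiant power of green")
```

A lux meter whose sensor does not follow the eye's response exactly would mis-weight those wavelengths, which is the error being assumed away for white bulbs here.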