How we test mobile phones

This is our thorough explainer on how we go about testing mobile phones.


This is one of our bigger and more in-depth tests as modern phones are essentially a laptop, camera, TV and telephone all rolled into one. As such, they require a series of rigorous and standardised tests.

Our buying guide explains how to choose the right mobile phone, and our test results show which models we recommend.

We try to get a cross-section of brands and models available in the New Zealand market. We can’t guarantee that we’ll test every single model, but we do get a good representation of what’s available to buy from retailers.

Where a model is released in different storage sizes, we don’t test each variation (for example, the iPhone 11 comes in 64GB, 128GB and 256GB). We do evaluate the differences though and create a score for each variation.

Camera quality

(15% of the overall score)

Some parts of the camera test, including simulated outdoors and image stabilisation.

The most detailed part of our testing looks at cameras, as this is the main selling point for most devices. The cameras are tested with a variety of scenes, including moving subjects and a test pattern. Every camera mode is tested, including the front-facing (selfie) camera.

For video, we measure the effect of image stabilisation, zooming while recording and the quality of the sound.

All the tests are done with the phone on a stable tripod, except for the image stabilisation test where the phone is locked into a stable rig with a known rate of vibration added artificially.

The resulting images and video are transferred to a computer and compared using analysis software. The images are also assessed subjectively.

The scenes tested are:

  • In daylight
    • still life in a studio with multiple different elements and textures
    • portrait with a mannequin in a simulation of outdoors, used to test HDR, face detection and bokeh (background blurring) effects
    • 4x zoom with a test pattern
    • front-facing camera “selfie” of a mannequin in front of a colourful background
    • wide-angle lens shot of mannequin as above
  • In low light – with and without flash
    • still life in a studio with multiple different elements and textures (a spinning disc is used to calculate exposure time)
    • front-facing camera “selfie” of a mannequin in front of a colourful background.
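One way a spinning disc can reveal exposure time: if the disc turns at a known speed, the angle a marker smears through in the photo is proportional to how long the shutter stayed open. The sketch below assumes that approach. The function name, disc speed and blur angle are illustrative, not figures from the lab.

```python
def exposure_time_s(blur_angle_deg, disc_rpm):
    """Estimate exposure time from the angle a disc marker smears through."""
    if disc_rpm <= 0:
        raise ValueError("disc_rpm must be positive")
    # A disc at disc_rpm revolutions per minute sweeps this many degrees each second.
    degrees_per_second = disc_rpm * 360.0 / 60.0
    return blur_angle_deg / degrees_per_second


# Hypothetical reading: a 60 rpm disc whose marker smears across 18 degrees.
print(exposure_time_s(18.0, 60.0))  # 0.05
```

A longer smear at the same disc speed means the phone chose a longer exposure, which usually brightens a low-light shot at the cost of motion blur.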

In addition to image quality, the lab also tests:

  • shutter delay (the time between pressing the shutter button and the photo being taken)
  • autofocus speed
  • start-up time.

Battery life

(15% of the overall score)

A robot arm operates the phone to recreate a “typical user day”, and the test continues until the battery is completely discharged.

The day is defined as:

  • stand-by (18.8h)
  • internet use (3h)
  • camera use – 5 pictures with main camera with 3 sec pauses, no flash
  • navigation use (0.5h) – scrolling on Apple Maps/Google Maps/Earth/Here Maps
  • calling use (1h) – calling in and out
  • two notifications per hour (display switches on/vibration alarm) are sent to the device being tested, with the notification displayed for 1 minute.
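As a rough illustration, the timed activities in this day can be added up and turned into an estimate of how many simulated days a charge might last. This is a sketch only: the power-draw figures and battery capacity are invented for the example, and the camera burst and notifications are left out because the article gives no duration for them.

```python
# Timed activities from the "typical user day", in hours.
DAY_PROFILE_HOURS = {
    "stand-by": 18.8,
    "internet": 3.0,
    "navigation": 0.5,
    "calling": 1.0,
}

# Hypothetical average power draw per activity, in milliwatts.
EXAMPLE_DRAW_MW = {
    "stand-by": 20,
    "internet": 1200,
    "navigation": 1800,
    "calling": 800,
}


def timed_hours(profile):
    """Total hours covered by the timed activities."""
    return sum(profile.values())


def days_of_use(capacity_mwh, draw_mw, profile):
    """Estimate how many simulated days one full charge would last."""
    energy_per_day_mwh = sum(hours * draw_mw[name] for name, hours in profile.items())
    return capacity_mwh / energy_per_day_mwh


print(round(timed_hours(DAY_PROFILE_HOURS), 1))  # 23.3
# Hypothetical 15,000 mWh battery.
print(round(days_of_use(15000, EXAMPLE_DRAW_MW, DAY_PROFILE_HOURS), 2))  # 2.64
```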

Every device is tested with two different brightness settings:

  • maximum brightness
  • display brightness set to 300 nits (a measure of luminance).

We check our measurements against the phone’s own usage statistics.

Display

(15% of the overall score)

The display quality is assessed with several objective and subjective tests.

Display size:
This is the simplest check: we measure the display and compare the result with the manufacturer’s claim.

Resolution:
We calculate resolution as the number of pixels in a given area. The manufacturer’s resolution claims (dots per inch, dpi) are checked using a binocular microscope. When a display has a resolution of more than 300 dpi, the average human eye cannot detect the pixel structure.
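A common way to derive a pixel density figure from a display’s specifications is to divide the length of the diagonal in pixels by the diagonal in inches. The sketch below shows that calculation; the panel dimensions are hypothetical, and this is not necessarily the exact method the lab uses alongside the microscope check.

```python
import math


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density along the diagonal of a display."""
    if diagonal_in <= 0:
        raise ValueError("diagonal_in must be positive")
    return math.hypot(width_px, height_px) / diagonal_in


# Hypothetical panel: 1080 x 2400 pixels on a 6.5-inch diagonal.
ppi = pixels_per_inch(1080, 2400, 6.5)
print(round(ppi))  # 405
print(ppi > 300)   # True: above the 300 dpi threshold mentioned above
```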

Brightness:
We use a luminance meter in low-light conditions. With automatic brightness switched off and manual brightness set to maximum, the phone displays a pure white image and the brightness is measured from 0.6m (roughly an arm’s length).

Contrast:
Contrast is the ratio of luminance between the white and the black parts of a picture. This is measured under the same conditions as brightness. The test image is pure white on one side and pure black on the other.
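The ratio itself is a single division, sketched below with made-up luminance readings. Treating a black reading of zero as an infinite ratio is an assumption of this example, since some displays switch black pixels off entirely.

```python
def contrast_ratio(white_cd_m2, black_cd_m2):
    """Ratio of white luminance to black luminance (both in cd/m2, or nits)."""
    if black_cd_m2 <= 0:
        # Assumption: a black level of zero is reported as infinite contrast.
        return float("inf")
    return white_cd_m2 / black_cd_m2


# Hypothetical readings: 500 nits on the white half, 0.5 nits on the black half.
print(contrast_ratio(500.0, 0.5))  # 1000.0
```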

Readability in various conditions:
Readability of the screen is evaluated from various angles and in various lighting conditions, including low light and bright sunlight. This part of the test also includes checking to see how easily the screen picks up fingerprints.

Performance

(15% of the overall score)

To test performance, we use standardised benchmark tests. However, comparing benchmark results across all phones has drawbacks:

  • The tests differ depending on the operating systems.
  • If the benchmark tests are updated, then the results may not be comparable to previous tests.
  • In the past, some manufacturers’ phones have detected the benchmark test and tuned the processing speed to maximum, giving an unrealistic result.

We minimise these by:

  • using a set of reference phones (slower models as well as top models) to normalise future updates with the current values
  • running the benchmark test multiple times, back to back – in some cases, the devices get warm or even hot and their performance decreases
  • measuring the hottest part of the smartphone with an IR thermometer while performing the benchmark test – the warmer the phone, the worse the score. If the device shuts down automatically or reduces its functions (for example, no flashlight, no camera mode) this has an impact on the result.
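To illustrate the first point, here is one possible way reference phones could be used to put scores from an updated benchmark back on the old scale. It is a sketch under our own assumptions: the geometric mean of the old-to-new ratios is a choice made for this example, and the phone names and scores are invented.

```python
import math


def rescale_factor(old_scores, new_scores):
    """Geometric mean of old/new score ratios across the shared reference phones."""
    shared = old_scores.keys() & new_scores.keys()
    if not shared:
        raise ValueError("no reference phone was scored on both versions")
    log_sum = sum(math.log(old_scores[name] / new_scores[name]) for name in shared)
    return math.exp(log_sum / len(shared))


def to_old_scale(new_score, factor):
    """Convert a score from the updated benchmark to the previous scale."""
    return new_score * factor


# Hypothetical reference phones scored on both benchmark versions.
old = {"ref_slow": 1000, "ref_mid": 2000, "ref_top": 4000}
new = {"ref_slow": 1250, "ref_mid": 2500, "ref_top": 5000}

factor = rescale_factor(old, new)
print(round(factor, 3))                   # 0.8
print(round(to_old_scale(3000, factor)))  # 2400
```

Using slow and fast reference phones together helps show whether the update shifted scores evenly across the range.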

Durability

(10% of the overall score)

The tumbler used in the drop test

The durability tests are designed to see how easily the phones break and whether they can be operated after breaking. For obvious reasons, this test is done last and we purchase two samples of each phone we test in case one breaks early.

Scratch resistance:
The tester attempts to scratch the display of the phone with a specialised hardness test “pencil” set to five different levels of force. The score is based on the maximum load that does not lead to permanent scratches on the device. The same test is applied to the glass on the camera.

Tumble test:
This test is designed to simulate dropping your phone on to a concrete surface from roughly waist or pocket height. The phones are turned on and placed into a tumbling Z-shaped drum with a drop height of 80cm on to a stone base.

The tumbler does 50 rotations (100 drops), with damage checked after 25 and 50 rotations. At each check, the functionality and usability of the phone are tested and scored.

Water resistance “rain” test:
The phone is switched on, connected to a network and placed horizontally on a rotating table under a rain simulator. The simulator gives an even distribution of water at 7.2L per hour. The phone is rained on for 5 minutes. The functionality is assessed immediately afterwards, then each day for three days.

Devices that claim to be waterproof (IPx7 or IPx8) are submerged to the stated depth and for the stated time (for example, 1.5m for 30 minutes). Again, the functionality is assessed immediately, and for three days afterwards.

Sound

(10% of the overall score)

Music quality:
The test is conducted using high-quality headphones. The tracks used for the test are encoded in MP3 format at 192 kbps. Each phone is tested with the same three songs, drawn from the pop and classical categories.

Sound quality of built-in speakers:
Very often, smartphones are not able to reproduce low frequencies. This leads to a very thin sound. We play back tones and measure the ratio of energy in the lower bands in proportion to the energy of the more prominent bands. This ratio provides information about the sound balance and the bass frequency ratio.
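The idea of comparing energy in the low bands with energy in the more prominent bands can be sketched with a simple Fourier transform. The band edges (60 to 250 Hz for bass, 250 to 4,000 Hz for the main band) and the synthetic test signal are assumptions made for this example, not the lab’s settings.

```python
import cmath
import math


def band_energy(samples, sample_rate, low_hz, high_hz):
    """Sum the spectral energy of a real signal between low_hz and high_hz (naive DFT)."""
    n = len(samples)
    energy = 0.0
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if low_hz <= freq < high_hz:
            coeff = sum(
                s * cmath.exp(-2j * math.pi * k * i / n)
                for i, s in enumerate(samples)
            )
            energy += abs(coeff) ** 2
    return energy


def bass_ratio(samples, sample_rate, bass_band=(60, 250), main_band=(250, 4000)):
    """Energy in the bass band relative to energy in the main band."""
    bass = band_energy(samples, sample_rate, *bass_band)
    main = band_energy(samples, sample_rate, *main_band)
    return bass / main


# Synthetic "thin" recording: a weak 125 Hz tone under a strong 1 kHz tone.
rate, n = 8000, 512
signal = [
    0.1 * math.sin(2 * math.pi * 125 * i / rate)
    + 1.0 * math.sin(2 * math.pi * 1000 * i / rate)
    for i in range(n)
]
print(round(bass_ratio(signal, rate), 3))  # 0.01
```

A ratio this low means the bass carries about one hundredth of the energy of the main band, which matches the thin sound described above.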

Loudness measurement of the speakerphone:
The maximum undistorted loudness is measured at a distance of 1m in an anechoic acoustic chamber (a room with baffles on the walls to muffle reflected sound).

Ease of use

(10% of the overall score)

To determine the ease-of-use score, our experts assess the usability of the most common day-to-day aspects of the phone.

Touchscreen: The touchscreen evaluation includes the design, shape, usability with small/big fingers, blind operation, pressure point and feedback (sound or haptic touch).

The test includes swiping with fingers, using multi-touch gestures (for example, zooming with 2 fingers), swipe writing, using any supplied pen or stylus, and the copy and paste functions. We also evaluate any physical buttons on the phone.

To determine ease of typing, we time how long it takes to write the following passage from The Scotsman’s Return from Abroad by Robert Louis Stevenson:

At last, across the weary faem,
Frae far, outlandish pairts I came.
On ilka side o' me I fand
Fresh tokens o' my native land.
Wi' whatna joy I hailed them a' -
The hilltaps standin' raw by raw,
The public house, the Hielan' birks,
And a' the bonny U.P. kirks!
But maistly thee, the bluid o' Scots,
Frae Maidenkirk to John o' Grots,
The king o' drinks, as I conceive it,
Talisker, Isla, or Glenlivet!
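A timed passage can be turned into a typing speed so that phones are compared on the same footing. The sketch below uses the common convention of counting five characters as one word. The article does not say how the timing is scored, so the convention and the 30-second timing are assumptions.

```python
def words_per_minute(text, seconds):
    """Typing speed, counting every five characters (including spaces) as one word."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (len(text) / 5) / (seconds / 60)


first_couplet = (
    "At last, across the weary faem,\n"
    "Frae far, outlandish pairts I came."
)
# Hypothetical timing: 30 seconds to type the first two lines.
print(round(words_per_minute(first_couplet, 30.0), 1))
```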

Calling

(2.5% of the overall score)

The acoustic chamber and robot head used in the perceptual objective listening quality analysis, or “sound test”.

Perceptual objective listening quality analysis
This is how we test the sound quality of a call. A dummy head with an artificial ear is placed into an anechoic chamber (removing any outside noise) and a phone is attached to it. This set-up is used to record the incoming and outgoing sound from the phone.

Four different situations are tested:

  1. Sending: speech quality without ambient noise
  2. Sending: speech quality with pink ambient noise at 70 dB(A)
  3. Receiving: speech quality without ambient noise
  4. Receiving: speech quality with pink ambient noise at 70 dB(A)

Ambient noise is added to make the measurements more realistic. Pink noise consists of all frequencies we can hear but is more intense at lower frequencies.

Reception

Reception is one of the most important qualities for a phone, hence the test is very technical.

The equipment used during the reception test.

We use reference signal received quality (RSRQ) for evaluating reception. It’s defined as:

  • a ratio of the reference signal received power (RSRP) to the received signal strength indicator (RSSI).

Or, more simply:

  • the strength of the reference signal the phone receives, compared with the total power it receives on that channel, including interference and noise.

The phones are placed in a shielded chamber to remove outside influence and connected to a 4G test network.

The signal strength is reduced until the RSRQ is below -15dB, which is a very bad connection, and that strength is recorded. The measurement is repeated using five different frequency bands and four different orientations of the device in order to minimise the influence of how the antenna is set up.
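In decibels, the LTE definition of RSRQ works out as RSRP minus RSSI plus a term for the number of resource blocks measured. The sketch below applies that formula and the -15dB threshold. The readings and the choice of a 10 MHz carrier (50 resource blocks) are hypothetical.

```python
import math


def rsrq_db(rsrp_dbm, rssi_dbm, resource_blocks):
    """RSRQ in dB: N x RSRP / RSSI, where N is the number of resource blocks measured."""
    return 10 * math.log10(resource_blocks) + rsrp_dbm - rssi_dbm


def is_very_bad(rsrq):
    """True once RSRQ has fallen below the -15 dB threshold used in the test."""
    return rsrq < -15.0


# Hypothetical readings on a 10 MHz LTE carrier (50 resource blocks).
value = rsrq_db(rsrp_dbm=-110.0, rssi_dbm=-77.0, resource_blocks=50)
print(round(value, 1), is_very_bad(value))  # -16.0 True
```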

Security (including privacy)

(2.5% of the overall score)

It’s important to be able to control your own data, hence privacy and security have to be considered when testing connected devices.

Security

The security section of the test includes a lot of elements.

There are several databases that track fixed and unfixed vulnerabilities. These databases are consulted for every device in the test.

The test also evaluates how the phone’s manufacturer handles password creation. The password ratings for the main account, and any associated brand accounts, are calculated using a formula that checks whether:

  • the user is encouraged to use upper and lower case letters, digits and symbols
  • there is a lower or upper limit for password length
  • the password strength is well explained, indicated, and checked against dictionaries.

We also check whether the manufacturer implements protection mechanisms to prevent brute force attacks on a password. These can include limiting the maximum number of incorrect entries or introducing an increasing cooldown period between attempts.
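An increasing cooldown can be as simple as doubling the wait after each failed attempt once a small allowance is used up. The sketch below shows one such scheme. The allowance of five tries, the 30-second starting delay and the one-hour cap are illustrative values, not any manufacturer’s actual settings.

```python
def cooldown_seconds(failed_attempts, free_attempts=5, base_seconds=30, cap_seconds=3600):
    """Delay before the next try; doubles with each failure past the free allowance."""
    if failed_attempts < free_attempts:
        return 0
    extra = failed_attempts - free_attempts
    return min(base_seconds * 2 ** extra, cap_seconds)


print([cooldown_seconds(n) for n in range(4, 10)])  # [0, 30, 60, 120, 240, 480]
```

The cap keeps a legitimate user from being locked out indefinitely while still making repeated guessing impractically slow.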

A formula is used to get a rating for any biometric lock options (such as fingerprint or face unlock). It looks at whether there are adequate explanations about the possible risks of using them.

We also evaluate whether:

  • the phone is encrypted after setting it up
  • it uses secure connections for sensitive data
  • it tries to trick the users into revealing more info than they wanted
  • the terms and conditions and privacy policy are provided in a clear way during registration.

The state of the device's encryption is checked in the settings or via developer options. During set-up and first use, we capture the device's network traffic. If there are unencrypted connections, the content is investigated for sensitive data like your phone’s unique IMEI or serial numbers.
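As an example of what looking for sensitive data in unencrypted traffic can involve, the sketch below scans a plaintext payload for 15-digit numbers that pass the Luhn check used by IMEIs. The request body is made up, and this is an illustration of the idea rather than the lab’s tooling.

```python
import re

# A 15-digit run that is not part of a longer number.
IMEI_CANDIDATE = re.compile(r"(?<!\d)\d{15}(?!\d)")


def luhn_valid(digits):
    """Check a digit string against the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_imeis(payload):
    """Return 15-digit numbers in the payload that pass the Luhn check."""
    return [match for match in IMEI_CANDIDATE.findall(payload) if luhn_valid(match)]


# Made-up plaintext request body; 490154203237518 is a widely used sample IMEI.
body = "POST /register device_id=490154203237518&ts=123456789012345"
print(find_imeis(body))  # ['490154203237518']
```

The checksum filters out most other 15-digit values, such as the timestamp-like number in the example, though any hit would still need a manual look.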

After set-up, we reset the phone – rating how easy it is to do so – and check to see if there are any leftovers from the previous user’s data. Additionally, we evaluate any statement the manufacturer gives about its end-of-life policy.

Privacy

For the privacy section, we assess aspects of the phone that relate to requests for your information.

Part of the rating depends on how much data the user has to provide to the company when creating an account. The device receives a better rating if only an e-mail address is requested. It receives a lower rating when additional information – like name, gender or birthday – has to be entered.

During installation, our experts assess the options the user has to choose from, including whether choices are opt-in rather than opt-out, and whether there are deceptive buttons or confusing submenus (also described as “dark patterns”).

Terms and conditions and privacy policies are rated on how easy they are to understand, as well as their length, layout and general user-friendliness.

Some smartphones are shipped with additional apps like virus scanners, speed boosters or help desk apps (sometimes called bloatware). We assess whether the user can remove or disable these extra apps. It is not considered bloatware if an Android phone provides alternatives to Google’s apps, such as an alternative app store, gallery or tools like a calculator.

Handset capabilities

(5% of the overall score)

This section looks at the phone’s communication technology that isn’t part of making phone calls.

Web browser convenience
Our experts use each phone to browse to a set of popular websites and do the same tasks:

  • typing addresses
  • creating and using favourites
  • zooming in and out
  • navigating
  • checking JavaScript
  • setting bookmarks
  • loading pages in the background
  • using the history.

Web browser benchmark measurement
The web browser performance of every smartphone is tested against two different benchmark tests:

  • Octane 2.0
  • Speedometer 2.0

The browser version is recorded for all phones.

GPS navigation

This tests how precisely the phone can figure out where you are at a given time.

Smartphones mostly use your mobile network to figure out where you are, by accessing a database of known WiFi networks and mobile network base stations. For an accurate location fix, within 5 to 10m, the phone uses satellite positioning alone. A system called A-GPS (assisted GPS) downloads supporting data over the mobile network, such as satellite orbit information, so the phone can get a satellite fix faster.

GPS test
With each phone, the testers drive the same 15km test track and record their position using pure GPS. We then look at the accuracy of the final recorded tracks, exact position versus displayed position and the smoothness of the signals, especially in areas with tunnels and weak satellite signal. Some phones use additional information from sensors like gyroscopes and compasses to interpolate weak satellite reception (for example, when driving through tunnels).

  • For Android: Maverick and GPS Status are used to track.
  • For iOS: the track analysis in the Sports Tracker app is used.
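Comparing exact position with displayed position comes down to measuring the distance between pairs of coordinates. The sketch below uses the haversine formula for that and averages the error along a track. The coordinates are invented, and the article does not state which distance formula the testers use.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mean_error_m(reference, recorded):
    """Average distance between matching points of the reference and recorded tracks."""
    if not reference or len(reference) != len(recorded):
        raise ValueError("tracks must be non-empty and the same length")
    errors = [haversine_m(*ref, *rec) for ref, rec in zip(reference, recorded)]
    return sum(errors) / len(errors)


# Hypothetical track points as (latitude, longitude) pairs.
reference_track = [(-41.2900, 174.7800), (-41.2910, 174.7810)]
recorded_track = [(-41.2900, 174.7801), (-41.2911, 174.7810)]
print(round(mean_error_m(reference_track, recorded_track), 1))
```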

All phones are checked to see if they can receive signals from the following satellite systems:

  • GPS
  • GLONASS
  • Galileo
  • BeiDou
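The section weights quoted throughout this article add up to 100%. The article does not say exactly how the section scores are combined, so the sketch below assumes a plain weighted average. The section scores in the example are invented, and reception has no separate weight in the article, so it is not listed.

```python
# Weights as stated in each section heading, in percent.
WEIGHTS = {
    "camera quality": 15.0,
    "battery life": 15.0,
    "display": 15.0,
    "performance": 15.0,
    "durability": 10.0,
    "sound": 10.0,
    "ease of use": 10.0,
    "calling": 2.5,
    "security": 2.5,
    "handset capabilities": 5.0,
}


def overall_score(section_scores):
    """Weighted average of the section scores (assumed aggregation method)."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[name] * section_scores[name] for name in WEIGHTS)
    return weighted / total_weight


print(sum(WEIGHTS.values()))  # 100.0

# Hypothetical phone scoring 80 in every section except a weaker camera.
scores = {name: 80.0 for name in WEIGHTS}
scores["camera quality"] = 60.0
print(overall_score(scores))  # 77.0
```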
