
51Degrees: More Than Three Times Faster!

Engineering

10/2/2017 9:00 AM

Device Detection Performance C Development Device Data News

51Degrees compares performance of our enhanced algorithm on three different commodity hardware platforms.

51Degrees is regularly evaluated against competitors DeviceAtlas and WURFL from ScientiaMobile for performance, accuracy and memory consumption. Plenty of resources to help engineers evaluating device detection solutions are available on this web site, including migration guides and API benchmarks. This blog post builds on the previous performance blog post from 2016, using the same commodity hardware: 1) a Raspberry Pi, 2) a low-end desktop PC and 3) a high-end multi-CPU server, but this time with our enhanced algorithm.

Enhanced Algorithm

In September 2017 we announced the launch of our new enhanced device detection algorithm, which delivers more detections per second and lower memory requirements than other high-performance algorithms. The enhanced algorithm offers better than 99.9% matching accuracy, a 114 MB fully initialised main memory requirement and a caching overhead as low as 1 MB of memory. This amounts to a marked increase in performance, as detailed further in our press release. The enhancement is easy for developers to deploy as it uses the same interface as our existing APIs. If you would like to find out more, please Contact Us.

Performance Matters

Device detection only provides benefits if it is so fast that there is no noticeable overhead. It also needs to be accurate enough to correctly identify the device, browser and operating system. As such, the number of device combinations (pairings of device, operating system and browser version) is an important measure of how comprehensive a device detection solution is. However, the larger the number of device combinations, the larger the data file, and therefore the more data needs to be searched in a finite period of time. This can place heavy demands on hardware, in both computation and memory. This blog post examines the performance characteristics of 51Degrees' enhanced algorithm, which overcomes these issues.

The Test

The performance test was carried out on three commodity hardware platforms, the same ones as before:

  1. A Raspberry Pi with a quad-core 1.2GHz Broadcom BCM2837 CPU and 1GB of RAM running Raspbian Jessie Lite.
  2. A low-end desktop PC with a 3GHz Intel Core 2 Duo and 4GB of RAM running Ubuntu 16.04 LTS.
  3. A high-end server with dual 2.2GHz Intel Xeon E5-2660 v2 10-core CPUs and 160GB of RAM running Windows Server 2012.

The source data used on all three platforms is a random sample of 1 million User-Agents representative of real-world data. A sample of 10 million User-Agents from the previous test was also used to benchmark platform three, the high-end server. In all cases, the source data was loaded into memory to eliminate any variation caused by disk caching.
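The measurement approach described above (load the whole sample into memory first, then time only the detection calls) can be sketched roughly as follows. This is an illustrative sketch, not the 51Degrees test harness: `detect` is a hypothetical stand-in for whatever detection call is being benchmarked, the helper names are invented for this example, and the single-threaded loop does not reflect the multi-core runs reported in this post.

```python
import time
from typing import Callable, Iterable, List


def load_user_agents(lines: Iterable[str]) -> List[str]:
    """Read every User-Agent into memory up front so disk I/O is never timed."""
    return [line.strip() for line in lines if line.strip()]


def measure_detections_per_second(user_agents: List[str],
                                  detect: Callable[[str], object]) -> float:
    """Time only the detection calls over an in-memory list."""
    start = time.perf_counter()
    for user_agent in user_agents:
        detect(user_agent)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very small inputs.
    return len(user_agents) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Hypothetical stand-in for a real detection call.
    def fake_detect(user_agent: str) -> bool:
        return "Mobile" in user_agent

    sample = load_user_agents([
        "Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) Mobile/14E277\n",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)\n",
        "\n",  # blank lines are skipped
    ])
    rate = measure_detections_per_second(sample * 10_000, fake_detect)
    print(f"{rate:,.0f} detections per second")
```

Keeping the load step separate from the timed loop is what removes disk caching from the measurement; in a real test the list would come from the User-Agent sample file rather than an inline list.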

Results

Platform 1

| Metric | Enterprise |
| --- | --- |
| Performance (detections per second) | 1,245,776 |
| Average time for a single detection per core (ms) | 0.003211 |

A Raspberry Pi model 3 - 1.2GHz 64-bit quad-core ARMv8 CPU and 1GB of RAM running Raspbian Jessie Lite. Price: $35

The Raspberry Pi was tested with 1 million User-Agents and the Enterprise data file. During the previous round of tests, we hit a memory constraint: the Raspberry Pi's 1GB of RAM fell just short of the roughly 1.2GB needed to run the full tests using the Enterprise data file and the old algorithm. As the enhanced algorithm is far more memory efficient, we had no trouble running the full test this time.

Platform 2

| Metric | Enterprise |
| --- | --- |
| Performance (detections per second) | 3,432,278 |
| Average time for a single detection per core (ms) | 0.000583 |

A 3GHz Intel Core 2 Duo with 4GB of RAM, running Ubuntu 16.04 LTS. Price: $200

The results on Platform 2 show that even low-cost, ageing hardware can still achieve above-par device detection performance.

Platform 3

| Metric | Enterprise, 1 Million | Enterprise, 10 Million |
| --- | --- | --- |
| Performance (detections per second) | 21,478,447 | 22,486,344 |
| Average time for a single detection per core (ms) | 0.000931 | 0.000889 |

Dual 2.2GHz Intel Xeon E5-2660 v2 10-core CPUs with 160GB of RAM, running Windows Server 2012. Price: $3000

As Platform 3 has a large amount of memory, we also tested it using the same sample of 10 million User-Agents from the previous round of performance tests in November 2016. The difference in performance between the 1 million and 10 million samples comes down to the sample data itself: shorter User-Agent strings can be faster to evaluate. A larger proportion of short User-Agent strings, usually those of bots and crawlers, could bring down the average time to evaluate the source data.
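The per-core timings reported for each platform appear consistent with dividing the number of cores by the overall detections per second. The sketch below checks that reading against the published numbers. The formula and the core counts (4, 2 and 2 x 10, taken from the hardware descriptions) are assumptions made for this example, not something the test methodology states.

```python
def ms_per_detection_per_core(detections_per_second: float, cores: int) -> float:
    """Average time one core spends on one detection, in milliseconds.

    Assumes the load is spread evenly over all cores.
    """
    return cores / detections_per_second * 1000.0


# (platform, detections per second from the results, assumed core count)
RESULTS = [
    ("Platform 1 (Raspberry Pi)", 1_245_776, 4),
    ("Platform 2 (desktop PC)", 3_432_278, 2),
    ("Platform 3 (server, 1 million)", 21_478_447, 20),
    ("Platform 3 (server, 10 million)", 22_486_344, 20),
]

if __name__ == "__main__":
    for name, rate, cores in RESULTS:
        print(f"{name}: {ms_per_detection_per_core(rate, cores):.6f} ms")
```

Under these assumptions the values round to 0.003211, 0.000583, 0.000931 and 0.000889 ms, which agrees with the figures reported for the three platforms.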

Comparison

Figure: Detections per second for each platform

Detections Per Dollar

Figure: Detections per dollar chart

Again, despite the performance difference shown in the previous graph, this chart shows that the lowest-cost device delivers the most detections per dollar.
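The cost comparison can be reproduced from the prices and results quoted earlier. This is a small sketch that assumes the metric is simply detections per second divided by the hardware price in US dollars; the exact calculation behind the chart is not stated, and the 1 million User-Agent figure is used for Platform 3.

```python
def detections_per_dollar(detections_per_second: float, price_usd: float) -> float:
    """Detections per second obtained for each dollar of hardware cost."""
    return detections_per_second / price_usd


# (platform, detections per second, price in USD) as quoted in this post
PLATFORMS = [
    ("Platform 1 (Raspberry Pi)", 1_245_776, 35),
    ("Platform 2 (desktop PC)", 3_432_278, 200),
    ("Platform 3 (server)", 21_478_447, 3000),
]

if __name__ == "__main__":
    ranked = sorted(PLATFORMS,
                    key=lambda p: detections_per_dollar(p[1], p[2]),
                    reverse=True)
    for name, rate, price in ranked:
        value = detections_per_dollar(rate, price)
        print(f"{name}: {value:,.0f} detections per second per dollar")
```

On this measure the Raspberry Pi comes out roughly twice as cost-efficient as the desktop PC and about five times as cost-efficient as the server, even though its raw throughput is the lowest.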

Conclusion

In the previous round of tests, we noticed that the Raspberry Pi has issues with thermal throttling when running high-intensity workloads. While we did find a cooling solution for the Pi, this setup is still unlikely to be used in a production environment.

Figure: Raspberry Pi cooling solution

Platform 2, a low-end desktop PC, has shown that even with ageing hardware (the CPU is 9 years old) you can still achieve decent results. Compared to our previous round of tests in November 2016, testing shows the enhancements have increased detections per second around threefold on average. This means that the overhead of integrating 51Degrees device detection libraries into your application will be even lower than before.

Figure: Performance increase (previous vs enhanced detections per second)

For businesses such as adtech companies processing tens of billions of advertising events per day, this enhancement will translate into cost savings and increased profits through a large increase in capacity and efficiency.