Performance

Considerations

Loqate is designed to use an efficient compressed and encrypted data storage and retrieval system, caching sections of the verification reference data into memory as appropriate. As with most applications that perform random access to data, it will perform better with a faster I/O subsystem and more memory available for caching.

When batch processing data, better performance will be achieved if the input can be pre-sorted, allowing the cache to better determine which data to pre-fetch. Where possible, it is recommended to sort input data by country, then region, then city. This can increase performance by up to 30% depending on the input data.
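
As an illustration, the following Python sketch pre-sorts a CSV batch file by country, region and city before it is handed to the batch process. The column names (Country, AdministrativeArea, Locality) and file names are assumptions for illustration; substitute whatever field names your input file actually uses.

    import csv

    # Hypothetical column names -- substitute the field names used in your input file.
    SORT_KEYS = ("Country", "AdministrativeArea", "Locality")

    def presort_batch(in_path: str, out_path: str) -> None:
        """Sort an input batch file by country, then region, then city,
        so that records needing the same reference dataset arrive together."""
        with open(in_path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            fieldnames = reader.fieldnames
            rows = sorted(reader, key=lambda r: tuple(r.get(k, "") for k in SORT_KEYS))

        with open(out_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)

    presort_batch("input.csv", "input_sorted.csv")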

Loqate is a multi-threaded library and is able to make excellent use of multi-core CPUs and multi-processor systems. Each additional process adds approximately (100 − 10 × n) percent, where n is the total process count, up to the limits of I/O speed and memory size. For instance, adding a second process on a multi-processor or multi-core system will give around an 80% performance increase, a third process around a 70% increase, and so on, as long as separate processing cores are available and I/O speed and memory size do not impede processing.
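
To make the scaling rule concrete, the short calculation below estimates cumulative throughput relative to a single process under the (100 − 10 × n)% rule stated above. It is a back-of-the-envelope model, not a measurement, and assumes enough cores, memory and I/O headroom are available.

    # Estimate cumulative throughput relative to a single process, using the
    # approximate (100 - 10 * n)% gain per additional process described above.
    def estimated_throughput(process_count: int) -> float:
        total = 1.0  # first process = 100% of single-process throughput
        for n in range(2, process_count + 1):
            total += (100 - 10 * n) / 100.0  # n-th process adds roughly (100 - 10n)%
        return total

    for p in range(1, 5):
        print(f"{p} process(es): ~{estimated_throughput(p):.1f}x single-process throughput")
    # 1 -> 1.0x, 2 -> 1.8x, 3 -> 2.5x, 4 -> 3.1x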

The processing speed varies significantly based on the countries included in the input data, and is related to the size of the relevant verification reference dataset. For instance, the reference dataset for the US is approximately 1.2Gb in size, while the reference dataset for Greece is approximately 5Mb. The quality of the input data also has a significant effect on performance: lower quality input means that Loqate may have to evaluate more closely matching records within the relevant reference dataset, and also perform more complex searches.

Benchmarking

For benchmarking purposes Loqate uses the following setup:

  • System: Commodity PC
  • CPU: Single Quad-Core CPU (AMD Phenom II X4 965)
  • RAM: 16Gb (4x4Gb DDR3 1333)
  • OS: Ubuntu 10.04 LTS x86_64

Running an evenly distributed worldwide data file through the command line Batch API with standard caching options, Loqate achieves approximately 3 million records per hour, or around 830 records per second.

Recommendations

Two options affect the amount of data the Loqate API will attempt to cache in memory: ReferenceDatasetCacheSize and ReferencePageCacheSize.

  • ReferenceDatasetCacheSize specifies how many different datasets to maintain memory-based caches for, based on the most recently used datasets. For processing worldwide data it may be desirable to increase this from its default value of 20. However, if the input data is sorted by country (as recommended above), increasing this value only increases memory usage for no benefit.
  • ReferencePageCacheSize specifies how many individual file pages to cache from each individual dataset. When processing files from a single country, or a small number of countries, it may be desirable to increase this value up to 256 depending on available memory, and to decrease ReferenceDatasetCacheSize to approximately 8.
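
As an illustration only, the sketch below shows cache settings for a batch confined to one or a few countries. The option names and values come from the text above, but the engine_options holder and the mechanism for supplying options (configuration file, command line or API option set) are assumptions that depend on your integration; treat this as a sketch, not the definitive Loqate API.

    # Illustrative cache settings for a batch confined to one or two countries.
    # `engine_options` is purely a hypothetical container; pass these values to
    # the engine however your integration expects options to be supplied.
    engine_options = {
        "ReferenceDatasetCacheSize": 8,    # cache fewer datasets (default is 20)
        "ReferencePageCacheSize": 256,     # cache more pages per dataset
    }

    # For worldwide, country-sorted input the defaults are usually sufficient,
    # since sorting already keeps each dataset's working set hot in the cache.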

When running the command line Batch API, peak performance is attained when the total number of threads is approximately twice the number of CPU cores available. For our benchmarking setup we find the best performance when running 4 separate ‘lqtbatch -tc 2’ processes, i.e. 4 parallel processes, each running with 2 threads, for a total of 8 threads on a 4-core machine. This keeps CPU usage as close to 100% as possible.
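
A minimal sketch of that arrangement is shown below, assuming the input has already been split into four per-process batch files. Only the -tc flag comes from the text above; the per-process input argument and file names are placeholders, so consult the Batch API documentation for the actual lqtbatch arguments.

    import subprocess

    # Launch 4 parallel lqtbatch processes, each with 2 worker threads (-tc 2),
    # for a total of 8 threads on a 4-core machine. How each process is pointed
    # at its slice of the input is a placeholder and depends on your setup.
    slices = ["batch_part1.csv", "batch_part2.csv", "batch_part3.csv", "batch_part4.csv"]

    processes = [
        subprocess.Popen(["lqtbatch", "-tc", "2", part])  # placeholder input argument
        for part in slices
    ]

    # Wait for all four processes to finish.
    for proc in processes:
        proc.wait()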