Loqate provides a bi-quarterly release. A release typically consists of updates to the API, the knowledge base (GKR), and often the reference data as well. This page outlines the major areas of testing that are essential to meeting the release criteria.
Regression Tests
The purpose of the regression tests is to ensure that functionality that worked in previous releases continues to work in the release under test. In addition, any new functionality will also be tested.
Test Requirements
The key test areas are the quality of results for the Verify and Geocode tools.
The 125,000 test case set is the main benchmark for regression testing. It contains test addresses from across the world, including many from major markets such as the USA, European countries, Japan, Australia, and China.
Test results for a given release are compared in the following manner, where ‘current’ refers to the existing release and ‘new’ refers to the release under test.
API | Knowledge Base | Reference Data | Notes |
---|---|---|---|
Current | Current | Current | Base for comparison |
New | Current | Current | |
New | New | Current | |
Current | New | New | Checking data compatibility with the older API |
New | New | New | |
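As an illustration, the matrix above could be driven from the command line; the runner name run_regression.sh and its flags below are hypothetical, not the actual harness interface:

#!/bin/bash
# Hypothetical driver for the comparison matrix; paths and flags are assumptions.
CUR=/opt/loqate/releases/current
NEW=/opt/loqate/releases/new
./run_regression.sh --api "$CUR" --gkr "$CUR/gkr" --data "$CUR/data"   # base for comparison
./run_regression.sh --api "$NEW" --gkr "$CUR/gkr" --data "$CUR/data"
./run_regression.sh --api "$NEW" --gkr "$NEW/gkr" --data "$CUR/data"
./run_regression.sh --api "$CUR" --gkr "$NEW/gkr" --data "$NEW/data"   # data compatibility with the older API
./run_regression.sh --api "$NEW" --gkr "$NEW/gkr" --data "$NEW/data"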
Test Environment
Loqate's internal auto-testing environment is used for testing a release. This environment lets the tester select the following for a test:
- Test record to be used for the test
- API to be used for the test
- GKR knowledge base to be used
- Reference data to be used
Performance Tests
The purpose of the performance tests is to quantify the responsiveness of the software and to ensure that a release shows no degradation in performance or memory usage relative to the previous release. There are broadly two areas:
- Valgrind tests (to obtain CPU cycles and peak memory usage), and
- throughput tests.
Valgrind Test Requirements
The key test areas are CPU usage and memory usage when running the Verify tool. CPU usage is measured in CPU cycles; memory usage is the peak memory used.
Tests that are run are the following:
Test record set | Type of data in the set | Test Script |
---|---|---|
us_10k | 10,000 records USPS CASS test dataset | runus.sh |
jp_10k | 10,000 records Japan only | runjp.sh |
cn_10k | 10,000 records China only | runcn.sh |
ru_10k | 10,000 records Russia only | runru.sh |
Valgrind Test Environment
- Test environment is Linux
- For the CPU cycle count, callgrind from the Valgrind suite of tools is used.
- For peak memory usage, massif from the Valgrind suite of tools is used.
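As a minimal sketch, a run script such as runus.sh might wrap the Verify tool with these two Valgrind tools as follows; the Verify binary name and its arguments here are assumptions:

#!/bin/bash
# Sketch only: the verify binary name and its input argument are assumptions.
INPUT=us_10k
valgrind --tool=callgrind --callgrind-out-file=callgrind.$INPUT.out ./verify $INPUT
valgrind --tool=massif --massif-out-file=massif.$INPUT.out ./verify $INPUT
callgrind_annotate callgrind.$INPUT.out | head -n 20   # summary of collected event counts
ms_print massif.$INPUT.out | head -n 30                # heap profile including the peak snapshot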
Valgrind Test Machine Details
Item | Value |
---|---|
OS | 64 bit Linux (2.6.32-358.2.1.el6.x86_64 #1 SMP ) |
CPU | AMD Phenom(tm) II X4 975 Processor, 4 cores |
Memory | 16 GB |
Disk | Output from the command hdparm -Tt /dev/sda: Timing cached reads: 8278 MB in 2.00 seconds = 4140.72 MB/sec; Timing buffered disk reads: 680 MB in 3.01 seconds = 226.29 MB/sec |
The full set of tests is run as follows:
- ./runcn.sh $1 &
- ./runus.sh $1 &
- ./runjp.sh $1 &
- ./runru.sh $1 &
The above tests output both the CPU cycle counts from callgrind and the peak memory usage from massif for the duration of each test.
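The peak heap figure can also be read directly from the massif output file; a sketch, assuming the output file name used in the run above:

grep mem_heap_B massif.us_10k.out | sed -e 's/mem_heap_B=\(.*\)/\1/' | sort -g | tail -n 1   # peak heap in bytes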
Throughput Test Requirements
The key test area is throughput, that is, the number of records processed per second. Five iterations are run; the best and worst results are discarded, and the remaining three are averaged to give the throughput figure (see the sketch after the table below).
Test record set | Type of data in the set | Test Script |
---|---|---|
us_10k | 10,000 records USPS CASS test dataset | bpt.sh |
jp_10k | 10,000 records Japan only | bpt.sh |
cn_10k | 10,000 records China only | bpt.sh |
ru_10k | 10,000 records Russia only | bpt.sh |
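A sketch of the iterate-and-average logic, assuming each bpt.sh run prints its records-per-second figure on the last line of output (that output format is an assumption):

#!/bin/bash
# Run 5 iterations, drop the best and worst rates, average the remaining 3.
for i in 1 2 3 4 5; do
  ./bpt.sh -l "$API_PATH" -d "$DATA_FOLDER_PATH" -t us_10k -f "$INPUT_FORMAT_STRING" | tail -n 1
done > rates.txt
sort -g rates.txt | sed '1d;$d' | awk '{ s += $1 } END { print s / NR, "records/sec" }'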
Throughput Test Environment
- Test environment is Linux
- For throughput, the time measurement is obtained via the “time” command. Throughput = number of records / time in seconds
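For example (illustrative numbers only): if “time” reports 20.0 seconds of real time for a 10,000-record run, the throughput is 10,000 / 20.0 = 500 records per second.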
Throughput Test Machine Details
Item | Value |
---|---|
OS | 64 bit Linux (2.6.32-358.el6.x86_64 #1 SMP ) |
CPU | Intel i5 750, 4 cores |
Memory | 3.7 GB |
Disk | Output from the command hdparm -Tt /dev/sda: Timing cached reads: 20338 MB in 2.00 seconds = 10185.51 MB/sec; Timing buffered disk reads: 382 MB in 3.02 seconds = 126.68 MB/sec |
The full set of tests is run as follows:
./bpt.sh -l <API_PATH> -d <DATA_FOLDER_PATH> -t <TEST_FILE_NAME> -f <INPUT_FORMAT_STRING> >> <RESULTS_FILE>
For example, the us_10k file may be run against the 2013Q2.0.4939 API and data as follows:
./bpt.sh -l /opt/loqate/releases/2013Q2.0.4939 -d /opt/loqate/releases/2013Q2.0.4939/data -t /opt/loqate/testfiles/us_10k -f "Address1|Locality|AdministrativeArea|PostalCode|Country" >> ./2013Q2Results.txt
Acceptance Criteria
Regression Test Criteria
Address Verification Code verification levels should show no unexpected drop at the higher levels. If higher levels gain against a loss at lower levels, that indicates an improvement in verification. Differences larger than 1% should be investigated on a per-record basis to confirm that the majority are improvements.
GeoAccuracy geocode levels should show no unexpected drop at the higher levels. If higher levels gain against a loss at lower levels, that indicates an improvement in geocoding. Differences larger than 1% should be investigated on a per-record basis to confirm that the majority are improvements.
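As a sketch of the 1% check, assuming each release's results have been summarized into a two-column file of verification level and record count (the file names and format are assumptions; both files must be sorted on the level column for join):

#!/bin/bash
# Flag verification levels whose share of records shifts by more than 1 percentage point.
# current.txt and new.txt format assumed: "<level> <count>" per line, sorted by level.
join current.txt new.txt | awk '
  { lvl[NR] = $1; cur[NR] = $2; new[NR] = $3; tc += $2; tn += $3 }
  END {
    for (i = 1; i <= NR; i++) {
      d = new[i] / tn - cur[i] / tc
      if (d > 0.01 || d < -0.01)
        printf "%s: %+.2f%% - investigate per record\n", lvl[i], d * 100
    }
  }'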
Any changes requiring per-record review will be accompanied by a textual description of the change, indicating whether it is a more accurate reflection of the quality of the input data, and the categorized reason for the difference.
The following changes will be classified as unacceptable for release:
- a data processing regression below the acceptance level over a previous version using a standard test file;
- a data processing regression on an exceptions file of previously reported issues.
Performance Test Criteria
A decrease in throughput of greater than 5% will be unacceptable for release unless a significant increase in overall quality of results can also be documented.
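For example (illustrative numbers only): if the current release measures 500 records per second, a new release measuring below 475 records per second is a drop of more than 5% and would be unacceptable unless accompanied by a documented quality gain.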
Any significant changes in CPU usage or memory usage analysis will be documented.