Registry Plus Link Plus Features and Future Plans
New Features in Version 2.0
- Improved file import process.
- Enhanced support for deduplication linkage.
- Ability to use nicknames in the First Name matching method.
- New and powerful manual review and file export functions.
- Improved context-sensitive online help.
Features
- Supports input from North American Association of Central Cancer Registries (NAACCR) files, fixed-width files, delimited files, and CRS Plus databases.
- Computes probabilistic record linkage scores based on the theoretical framework developed by Fellegi and Sunter (Fellegi IP, Sunter AB. A theory for record linkage. Journal of the American Statistical Association 1969;64:1183–1210).
- Handles missing values of matching variables by treating null or empty values as missing data automatically, and allows the user to indicate additional values to treat as missing data.
- Provides a simple and efficient blocking mechanism ("OR blocking") by indexing the blocking variables and comparing only those record pairs that have identical values on at least one blocking variable.
- Offers a choice of two phonetic coding systems (Soundex and NYSIIS), as well as several variable-specific matching methods that find partial, approximate, or fuzzy matches.
- Provides the following matching methods, or comparators. In addition to the exact matching method, several approximate matching methods find partial, approximate, or fuzzy matches and are customized for the content of specific data items or types:
- Value-specific (frequency-based): Sets weights for matching values based on the frequencies of values in the files being compared. A match on a frequent value is associated with a low weight, but a match on a rare value is associated with a high weight.
- Last name and first name: Combines partial matching, value-specific matching, and NYSIIS phonetic coding to account for minor typographical errors, misspellings, and hyphenated names. For first names, nicknames are matched with formal names.
- Middle name: Accounts for a middle initial in one record versus a full middle name in the other.
- Date: Incorporates partial matching on separate date components, and accounts for transposition of date components, as well as missing month or day values.
- Social Security number: Accounts for typographical errors and transposition of digits. Also matches a 9-digit number in one file with a 4-digit number in another file.
- Generic string: Uses an edit distance function and incorporates partial matching to account for typographical errors.
- ZIP Code: Enables the match between a 9-digit ZIP Code and a 5-digit ZIP Code.
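The scoring, missing-value, OR-blocking, and value-specific features listed above can be illustrated with a short Python sketch. This is not Link Plus code. It uses the standard Fellegi-Sunter agreement and disagreement weights with exact comparison only, and every function name, the record layout, and the m and u probabilities are hypothetical. Giving a missing value a weight of zero is also an assumption made for illustration, not a documented Link Plus behavior.

```python
import math
from collections import defaultdict


def agreement_weight(m, u):
    """Fellegi-Sunter weight added when a field agrees: log2(m / u)."""
    return math.log2(m / u)


def disagreement_weight(m, u):
    """Fellegi-Sunter weight added when a field disagrees: log2((1-m) / (1-u))."""
    return math.log2((1 - m) / (1 - u))


def value_specific_weight(m, value_frequency):
    """Frequency-based agreement weight (illustrative).

    The chance agreement probability u is approximated by the relative
    frequency of the matching value, so rare values earn higher weights.
    """
    return math.log2(m / value_frequency)


def is_missing(value, extra_missing=()):
    """Null and empty values are missing; callers may list additional codes."""
    return value is None or value == "" or value in extra_missing


def score_pair(rec_a, rec_b, params, extra_missing=()):
    """Sum field weights for one record pair.

    params maps field name -> (m, u). Fields missing in either record
    contribute zero (an assumption of this sketch).
    """
    total = 0.0
    for field, (m, u) in params.items():
        va, vb = rec_a.get(field), rec_b.get(field)
        if is_missing(va, extra_missing) or is_missing(vb, extra_missing):
            continue
        if va == vb:
            total += agreement_weight(m, u)
        else:
            total += disagreement_weight(m, u)
    return total


def or_blocking_pairs(file_a, file_b, blocking_fields, extra_missing=()):
    """Return index pairs that agree exactly on at least one blocking field."""
    pairs = set()
    for field in blocking_fields:
        # Index file_b by the value of this blocking field.
        index = defaultdict(list)
        for j, rec in enumerate(file_b):
            value = rec.get(field)
            if not is_missing(value, extra_missing):
                index[value].append(j)
        # Look up each file_a record in the index.
        for i, rec in enumerate(file_a):
            value = rec.get(field)
            if not is_missing(value, extra_missing):
                for j in index.get(value, ()):
                    pairs.add((i, j))
    return pairs
```

With blocking on last name and date of birth, for example, a pair that shares only the last name is still compared and scored, while a pair that shares neither value is never generated. A real comparator would replace the exact equality test in score_pair with the approximate methods described in the list above.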
Future Plans
- Perform a professional usability study to improve the user-friendliness and effectiveness of the interface.
- Develop a phone number matching method to handle a partial match on the last seven digits.
- Develop an address matching method.
- Provide frequency distribution of variables to help users identify missing values.
- Convert to .NET.
The Link Plus Development Priority List is a list of development tasks prioritized by the NPCR Registry Plus development team. Each task is the direct result of meetings with the Registry Plus User Group (RPUG) as well as requests from individual cancer registries and leaders in the cancer registry field. For more information on Registry Plus or RPUG, please contact cancerinfo@cdc.gov.
Contact Us:
- Centers for Disease Control and Prevention
Division of Cancer Prevention and Control
4770 Buford Hwy NE
MS K-64
Atlanta, GA 30341
- 800-CDC-INFO
(800-232-4636)
TTY: (888) 232-6348
8am–8pm ET
Monday–Friday
Closed on Holidays
- cdcinfo@cdc.gov


