Registry Plus™ Link Plus Features and Future Plans

New Features in Version 2.0

  • Improved file import process.
  • Enhanced support for deduplication linkage.
  • Ability to use nicknames in the First Name matching method.
  • New and powerful manual review and file export functions.
  • Improved context-sensitive online help.

Features

  • Supports North American Association of Central Cancer Registries (NAACCR) file, fixed width file, delimited file, and CRS Plus database.
  • Computes probabilistic record linkage scores based on the theoretical framework developed by Fellegi and Sunter. (Fellegi IP, Sunter AB. A theory for record linkage. Journal of the American Statistical Association 1969;64:1183–1210).
  • Handles missing values of matching variables by treating null or empty values as missing data automatically, and allows the user to indicate additional values to treat as missing data.
  • Provides a simple and efficient blocking mechanism ("OR blocking"): the blocking variables are indexed, and only pairs of records with identical values on at least one blocking variable are compared.
  • Offers a choice of two phonetic coding systems (Soundex and NYSIIS), as well as several variable-specific matching methods that find partial, approximate, or fuzzy matches.
  • Provides the following matching methods, or comparators. In addition to the exact matching method, several approximate matching methods find partial or fuzzy matches and are customized for the content of specific data items or types:
    • Value-specific (frequency-based): Sets weights for matching values based on the frequencies of values in the files being compared. A match on a frequent value is associated with a low weight, but a match on a rare value is associated with a high weight.
    • Last name and first name: Combines partial matching, value-specific matching, and NYSIIS phonetic coding to account for minor typographical errors, misspellings, and hyphenated names. For first names, nicknames are matched with formal names.
    • Middle name: Accounts for occurrence of the middle initial versus the full middle name.
    • Date: Incorporates partial matching on separate date components, and accounts for transposition of date components, as well as missing month or day values.
    • Social Security number: Accounts for typographical errors and transposition of digits. Also matches a 9-digit number in one file with a 4-digit number in another file.
    • Generic string: Uses an edit distance function and incorporates partial matching to account for typographical errors.
    • ZIP Code: Enables the match between a 9-digit ZIP Code and a 5-digit ZIP Code.
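
The scoring, missing-value handling, and "OR blocking" described in the list above can be illustrated with a minimal sketch. This is not Link Plus code: the field names, the m- and u-probabilities, and the helper functions are hypothetical, and only exact agreement is shown (none of the approximate comparators). It assumes the usual Fellegi-Sunter form, in which a field that agrees contributes log2(m/u), a field that disagrees contributes log2((1-m)/(1-u)), and a field that is missing in either record contributes nothing.

```python
import math
from collections import defaultdict

# Assumption: values treated as missing data (null or empty).
MISSING = {None, ""}

# Hypothetical (m, u) probabilities per matching variable:
# m = P(agree | records match), u = P(agree | records do not match).
PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "zip": (0.90, 0.05),
}

def field_weight(a, b, m, u):
    """Weight contributed by one field under exact comparison."""
    if a in MISSING or b in MISSING:
        return 0.0                          # missing data adds no evidence
    if a == b:
        return math.log2(m / u)             # agreement weight (positive)
    return math.log2((1 - m) / (1 - u))     # disagreement weight (negative)

def score(rec1, rec2):
    """Total linkage score: sum of the field weights."""
    return sum(field_weight(rec1.get(f), rec2.get(f), m, u)
               for f, (m, u) in PARAMS.items())

def or_blocking(file1, file2, block_vars):
    """Return index pairs (i, j) that share a value on at least one blocking variable."""
    pairs = set()
    for var in block_vars:
        index = defaultdict(list)           # value -> positions in file2
        for j, rec in enumerate(file2):
            value = rec.get(var)
            if value not in MISSING:
                index[value].append(j)
        for i, rec in enumerate(file1):
            value = rec.get(var)
            if value not in MISSING:
                for j in index.get(value, []):
                    pairs.add((i, j))
    return pairs

file1 = [{"last_name": "SMITH", "birth_date": "19500102", "zip": "30341"}]
file2 = [
    {"last_name": "SMITH", "birth_date": "19500102", "zip": ""},
    {"last_name": "JONES", "birth_date": "19600101", "zip": "30341"},
    {"last_name": "DOE", "birth_date": "19700101", "zip": "99999"},
]

for i, j in sorted(or_blocking(file1, file2, ["last_name", "zip"])):
    print(i, j, round(score(file1[i], file2[j]), 2))
```

Only the first two records of the second file are scored, because the third shares neither a last name nor a ZIP Code with the record in the first file. The first pair scores higher than the second because it agrees on two discriminating fields and its empty ZIP Code is ignored rather than counted as a disagreement.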

Future Plans

  • Perform a professional usability study to maximize the user-friendliness and effectiveness of the interface.
  • Develop a phone number matching method to handle a partial match on the last seven digits.
  • Develop an address matching method.
  • Provide frequency distribution of variables to help users identify missing values.
  • Convert to .NET.
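
As an illustration of the planned phone number matching method listed above, a partial match on the last seven digits might look like the following sketch. The function name and the four agreement levels are assumptions made for this example; the source does not describe how Link Plus would implement the comparator.

```python
import re

def compare_phones(a, b):
    """Classify agreement between two phone numbers (hypothetical levels)."""
    digits_a = re.sub(r"\D", "", a or "")   # keep digits only
    digits_b = re.sub(r"\D", "", b or "")
    if len(digits_a) < 7 or len(digits_b) < 7:
        return "missing"                    # too short to compare
    if digits_a == digits_b:
        return "full"                       # all digits agree
    if digits_a[-7:] == digits_b[-7:]:
        return "partial"                    # last seven digits agree
    return "none"

print(compare_phones("(404) 555-0123", "555-0123"))
```

Stripping non-digit characters first lets differently formatted numbers be compared, and falling back to the last seven digits lets a number recorded with an area code match one recorded without it.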

The Link Plus Development Priority List is a list of development tasks prioritized by the NPCR Registry Plus development team. Each task is the direct result of meetings with the Registry Plus User Group (RPUG) as well as requests from individual cancer registries and leaders in the cancer registry field. For more information on Registry Plus or RPUG, please contact cancerinfo@cdc.gov.

Link Plus Development Priority List (updated January 26, 2011)
Completed Tasks
Released version 2.0
Added the following features to version 3.0:
  1. Provides the name-matching methods for multiple names (similar to array comparison)
  2. Allows users to save the setting for export
  3. Provides nonmatch report after manual review
  4. Removes the limit on the size of file 2
  5. Allows many-to-many linking
  6. Allows users to review up to 300,000 pairs of linked records on a single form
  7. Allows users to export fixed-width format files (including NAACCR format file)
  8. Supports NAACCR 12 format
Remaining Tasks
  • Test the beta release of version 3.0
  • Provide the option of using day, month, and year separately
  • Allow additional date formats
  • Remove the limit on the size of file 1
  • Allow CRS Plus users to select additional variables for manual review
  • Add the ability to transport view files so that multiple users can perform manual reviews on different computers
  • Implement the address comparison function in C++
  • Write a paper about Link Plus
  • Implement the ZIP Code comparison function in C++
  • Refine name comparators using the frequencies of names by sex
  • Conduct a professional usability evaluation for interface standardization
  • Provide a phone number comparator
 