Link Plus

Link Plus is a record linkage tool for cancer registries. It can run in two modes—

  • To detect duplicates in a cancer registry database.
  • To link a cancer registry file with external files.

Although originally designed for cancer registries, the program can be used with any type of data in fixed-width or delimited format.

Selected Features in Version 2.0

  • Supports North American Association of Central Cancer Registries (NAACCR) files, fixed-width files, delimited files, and CRS Plus databases.
  • Computes probabilistic record linkage scores.
  • Handles missing values of matching variables by treating null or empty values as missing data automatically, and allows the user to indicate additional values to treat as missing data.
  • Facilitates OR blocking by indexing the blocking variables and comparing only those pairs that have identical values on at least one blocking variable.
  • Offers a choice of two phonetic coding systems (Soundex and NYSIIS), as well as several variable-specific matching methods that find partial, approximate, or fuzzy matches.
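
As an illustration of the OR blocking and missing-value features listed above, here is a minimal Python sketch. It is not Link Plus code: the function name `or_blocking_pairs`, the dictionary-based records, and the `MISSING` set are assumptions made for the example. Each blocking variable is indexed on file 1, and a pair becomes a candidate if it agrees on at least one blocking variable; null or empty values never count as agreement.

```python
from collections import defaultdict

# Values treated as missing. None and "" mirror the automatic behavior described
# above; a user could add more (for example "UNK"). This set is an assumption.
MISSING = frozenset({None, ""})

def or_blocking_pairs(file1, file2, blocking_vars, missing=MISSING):
    """Return index pairs (i, j) that agree on at least one blocking variable."""
    pairs = set()
    for var in blocking_vars:
        # Index file 1 by the value of this blocking variable.
        index = defaultdict(list)
        for i, rec in enumerate(file1):
            value = rec.get(var)
            if value not in missing:
                index[value].append(i)
        # Look up every file 2 record in that index.
        for j, rec in enumerate(file2):
            value = rec.get(var)
            if value not in missing:
                for i in index.get(value, []):
                    pairs.add((i, j))
    return pairs

file1 = [{"ssn": "123", "dob": "19500101"}, {"ssn": "", "dob": "19600202"}]
file2 = [{"ssn": "123", "dob": "19990909"}, {"ssn": "", "dob": "19600202"}]
candidates = or_blocking_pairs(file1, file2, ["ssn", "dob"])
```

In this example the two records with an empty `ssn` are paired only because they share a date of birth, not because both identifiers are blank.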

Link Plus provides the following matching methods—

  • Value-specific (frequency-based): Sets weights for matching values based on the frequencies of values in the files being compared. A match on a frequent value is associated with a low weight, but a match on a rare value is associated with a high weight.
  • Last name and first name: Combines partial matching, value-specific matching, and NYSIIS phonetic coding to account for minor typographical errors, misspellings, and hyphenated names. For first names, nicknames are matched with formal names.
  • Middle name: Accounts for occurrence of the middle initial versus the full middle name.
  • Date: Incorporates partial matching on separate date components, and accounts for transposition of date components, as well as missing month or day values.
  • Social Security number: Accounts for typographical errors and transposition of digits. Also matches a 9-digit number in one file with a 4-digit number in another file.
  • Generic string: Uses an edit distance function and incorporates partial matching to account for typographical errors.
  • ZIP Code: Enables the match between a 9-digit ZIP Code and a 5-digit ZIP Code.
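
To make the value-specific (frequency-based) method above more concrete, the following Python sketch uses a common textbook formulation of agreement weights, log2(m / u), where u is approximated by the relative frequency of the value. This is an assumption for illustration only; the source does not state the formula Link Plus uses, and the function name and the `m_prob` parameter are hypothetical.

```python
import math
from collections import Counter

def value_specific_weights(values, m_prob=0.95):
    """Map each observed value to an agreement weight based on its frequency.

    m_prob is an assumed probability that true matches agree on the variable.
    The chance of agreeing by coincidence is approximated by the value's
    relative frequency, so rare values receive higher weights.
    """
    counts = Counter(v for v in values if v)  # skip empty or missing values
    total = sum(counts.values())
    return {v: math.log2(m_prob / (n / total)) for v, n in counts.items()}

last_names = ["SMITH"] * 8 + ["ZABROWSKI"] * 2
weights = value_specific_weights(last_names)
```

Here agreement on the rare name ZABROWSKI receives a higher weight than agreement on the frequent name SMITH, which is the behavior the bullet describes.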

Selected Features in Beta Version (3.0 Beta)

External Data Linkage

  • Handles any number of records in file 2, provided the computer has enough memory to read in the data from file 1, which is limited to 4.5 to 4.8 million records.
  • Provides a new Best Match option to choose whether to write all potential matches (many:many linkages) or only the matches with the highest score to the linkage report (one:many linkages).
  • Allows users to choose whether to output a non-match file.
  • Provides a confirmation-like method for variables such as address, which contribute positive weight to the linkage score on agreement but zero weight on disagreement.
  • Provides a Social Security number-like matching method for a generic ID.
  • Provides a new name matching method that is more robust against the frequency of names or outlier names such as misspelled names.
  • Allows variables to be selected as matching variables multiple times to perform array comparisons automatically.
  • Allows users to provide their own name frequency files for use by name matching methods.
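
The confirmation-like method in the list above can be sketched in Python as follows. The function name, the weight values, and the choice to give missing values zero weight are assumptions for the example, not the program's actual implementation.

```python
def field_weight(a, b, agree_w, disagree_w, confirm_only=False):
    """Weight that one variable contributes to a pair's linkage score."""
    if not a or not b:
        # Assumption: a missing value on either side contributes no evidence.
        return 0.0
    if a == b:
        return agree_w
    # A confirmation-like variable is not penalized for disagreement.
    return 0.0 if confirm_only else disagree_w

# Hypothetical weights: an address that differs (for example, after a move)
# leaves the score unchanged instead of lowering it.
standard = field_weight("12 OAK ST", "99 ELM AVE", 3.0, -2.0)
confirming = field_weight("12 OAK ST", "99 ELM AVE", 3.0, -2.0, confirm_only=True)
```

The design choice is that a variable such as address can only raise a pair's score, which suits values that legitimately change over time.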

Manual Review

  • Allows users to use “Assign Set ID” to group matches into mutually exclusive match sets.
  • Allows 300,000 pairs on manual review forms.
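
The source does not describe how "Assign Set ID" forms its groups. One way to picture mutually exclusive match sets is as connected components of the accepted pairs, sketched below in Python with a small union-find structure. The function name and the record identifiers are hypothetical.

```python
def assign_set_ids(pairs):
    """Group records linked by accepted pairs into mutually exclusive sets."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the sets of the two records in every accepted pair.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Number the sets 1, 2, 3, ... in order of first appearance.
    labels, set_ids = {}, {}
    for rec in list(parent):
        root = find(rec)
        labels.setdefault(root, len(labels) + 1)
        set_ids[rec] = labels[root]
    return set_ids

accepted = [("A1", "B1"), ("B1", "A2"), ("A3", "B7")]
set_ids = assign_set_ids(accepted)
```

With these pairs, A1, B1, and A2 share one set ID and A3 and B7 share another, so every record belongs to exactly one set.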

Export

  • Allows users to export the results of manual review to a delimited format file or to a fixed-width file (including a NAACCR-formatted file).
  • Allows users to export all non-matching records from the linkage in a single export file. The export file includes all records from file 2 that generated a linkage score lower than the specified cutoff value for the linkage on the Linkage Configuration window, and any records that have been assigned a false match status upon review of potential matches on the Manual Review window.
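
The selection rule for the non-match export described above can be sketched in Python. This is an illustration under stated assumptions: the function name, the `"false match"` status label, the use of each record's best score, and the treatment of records with no score as non-matches are all hypothetical and not taken from the program.

```python
def non_matching_records(file2_ids, best_scores, review_status, cutoff):
    """Return file 2 record IDs that would go into the non-match export."""
    selected = []
    for rec_id in file2_ids:
        score = best_scores.get(rec_id)
        # Assumption: a record that formed no scored pair is a non-match.
        below_cutoff = score is None or score < cutoff
        false_match = review_status.get(rec_id) == "false match"
        if below_cutoff or false_match:
            selected.append(rec_id)
    return selected

ids = ["R1", "R2", "R3", "R4"]
scores = {"R1": 12.4, "R2": 3.1, "R3": 15.0}   # R4 formed no scored pair
status = {"R3": "false match"}                  # assigned during manual review
exported = non_matching_records(ids, scores, status, cutoff=7.0)
```

R2 is selected because its score is below the cutoff, R3 because it was marked a false match on review, and R4 because it has no score at all.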

System Requirements

Registry Plus programs work with 32- and 64-bit Microsoft® Windows® operating systems. Additional system requirements include—

  • Operating system: Microsoft® Windows® 7, 8, 10, or more recent.
  • System memory: 2 GB or more.
  • Minimum free disk space: 1 GB for file operations (additional disk space commensurate with data files).

Installing and Upgrading Link Plus

Before you install or upgrade Link Plus, please read the following information.

  • This version of Link Plus was designed to upgrade older application versions automatically while preserving configuration settings.
  • If you are upgrading or reinstalling an existing version of Link Plus, you will need to uninstall the existing Link Plus program before you install version 2. To uninstall Link Plus, click Start, All Programs, Registry Plus, Link Plus, Uninstall Link Plus. Finally, click Yes to confirm that you want to uninstall the product, and Link Plus will be uninstalled. After the previous version of Link Plus is uninstalled, the new production version may be installed.
  • The files and folder structure are preserved after you uninstall the previous version of Link Plus. Install this version in the same folder as your previous version of Link Plus to ensure the database and configuration settings are upgraded properly.
  • If you are installing Link Plus for the first time, you may install it on any drive, but it must be installed at the root level, not in a subfolder.

Installing Version 2.0

  1. Download Link Plus, RPLinkPLus_2.0.exe (executable file, 20.7 MB, June 29, 2007) to your computer.
  2. Open the downloaded file. The installation program will direct you through the steps for installing Link Plus. If you are a first-time user, we recommend you select the defaults.
  3. After installation, click the Windows Start button and select Programs, then Registry Plus, then Link Plus.

Note: When first installed, Link Plus includes test data files and configuration files for demonstration purposes. To run the demonstrations, go to the File menu and select Open Configuration File. From the configuration folder, select the file FWDemo.cfg to perform linkage on two fixed-width files, or DelimDemo.cfg to perform linkage on two delimited files. You can demonstrate additional features of the program by making changes on the configuration form and saving your new configurations under different file names.

Installing Version 3.0 Beta

To obtain a copy of the Link Plus version 3.0 beta program, please contact cancerinformatics@cdc.gov.