Free-text Coding in NSSP–ESSENCE: Part 3

Part 3 - Inclusionary Terms

This is the third article in the series about how to write ESSENCE free-text queries. We thank Senior Data Analyst Zachary Stein for developing this series.

Part 1. Wildcards
Part 2. Underscores _ and Brackets [ ]
Part 4. Exclusion Terms and Parentheses
Part 5. A “Starter” Fall-related Injury Query and Examples of Complex Queries
Part 6. Wrapping Things Up
Part 7. Additional Tips: ISBLANK and ISNULL
Part 8. Additional Tips: !TERM! Syntax

Introduction

The search criteria for ESSENCE free-text queries are built around Boolean logical operators and regular expressions. Free-text queries are not case-sensitive and may contain “^” for wildcards; “,” for multiple entries; “ISBLANK” to look for blanks; “ISNULL” to look for nulls; “[COMMA]” to look for commas; the operators “and,” “or,” and “andnot”; and parentheses “()” to define order and grouping. This series will cover all these topics in depth.

Free-text queries are what make syndromic surveillance practice, particularly practice using NSSP–ESSENCE, adaptable to different data sources and types. By using free-text queries, analysts and epidemiologists can exercise a high level of customization. They can quickly code free-text queries and rapidly respond to outbreaks, disasters, and other events as they unfold. Such capabilities let users tailor queries to the level of detail in their data, which improves the accuracy of results.

Free-text coding in ESSENCE, which is accessible to all users, follows distinct patterns. Learning to read these patterns allows users to take queries from many places and repurpose them to suit their unique needs. Syndromic surveillance depends heavily on sharing methods, and practitioners must understand the language.

Part 3. Inclusionary Terms

Now that this series has covered basic query notation and the use of carets, underscores, and brackets, the next step is to string together query criteria in a series. Even though a syndrome definition could technically be a single term, using multiple terms lets a definition capture the varied wording found in a field such as Chief Complaint (CC) text, or the different forms of Discharge Diagnosis (DD) codes. This part covers the basic “OR” and “AND” statements that link multiple terms and offers basic tips for using these statements.

“OR” statements

“OR” statements are the most basic way to use multiple query terms in ESSENCE. Many queries and data explorations start with a basic search for “SYMPTOM #1” OR “SYMPTOM #2.” In ESSENCE, an “OR” statement is automatically applied to terms separated by a comma, but a best practice is to include all operators to improve troubleshooting and readability when sharing syndrome definitions. ESSENCE queries are not case sensitive, and neither are the operators. Some ESSENCE users prefer to capitalize all operators to make them stand out, whereas other users prefer to leave operators lowercase. It’s all personal preference. However, operators should always be surrounded by commas, and the correct format of a query for THIS, THAT, or THERE follows:

THIS,OR,THAT,OR,THERE

Note the commas on both sides of each operator.

Let’s assume the following Chief Complaints (used in Parts 1 and 2) and a desire to create a query for fall-related injuries.

  1. Fall
  2. Fell getting out of car
  3. Left arm injury; Fall
  4. Falling out with friends; Suicidal
  5. Feels crestfallen
  6. Patient brought in after falling on face
  7. Fall; Left wrist injury
  8. Feels congested; Allergies

You may reasonably assume CCs 1, 2, 3, 6, and 7 are the intended cases and 4, 5, and 8 are false positives.

Here’s a table that shows examples of the “OR” query:

“OR” Query Examples

Code: ^Fall^
Description: This is a simple query of a term surrounded by carets taken from Part 1. It returns CCs 1, 3, 4, 5, 6, and 7.

Code: Fall,OR,Fell getting out of car,OR,Left arm injury; Fall,OR,Patient brought in after falling on face,OR,Fall; Left wrist injury
Description: This extremely long query lists the CCs exactly as they appear and accomplishes our goal to include CCs 1, 2, 3, 6, and 7 while excluding CCs 4, 5, and 8. Although it works for this very small sample of CCs, it almost certainly will not work well on the broad range of real syndromic data. It’s also long and hard to read.

Code: ^Fall^,OR,^Fell^
Description: This covers both forms of “Fall” in our CCs. Returns CCs 1, 2, 3, 4, 5, 6, and 7.

Code: ^ Fall^,OR,Fall ^,OR,Fall,OR,^Fell^
Description: The first term removes the false positive “Crestfallen,” and the second term captures visits where the term starts the CC. The third term returns the stand-alone fall term, and the fourth term returns all visits containing “Fell.” Returns CCs 1, 2, 3, 6, and 7.
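To see why the simpler “OR” queries return the CCs they do, it can help to emulate the matching outside of ESSENCE. The following Python sketch is not ESSENCE code. It assumes that a caret stands for any run of characters (including none), that a term must otherwise match the whole field, and that matching ignores case. The names term_matches and run_or_query are made up for this illustration. The sketch also treats punctuation literally, so space-sensitive terms such as Fall ^ may behave differently here than in ESSENCE.

```python
import re

# The eight sample Chief Complaints from this article.
CHIEF_COMPLAINTS = {
    1: "Fall",
    2: "Fell getting out of car",
    3: "Left arm injury; Fall",
    4: "Falling out with friends; Suicidal",
    5: "Feels crestfallen",
    6: "Patient brought in after falling on face",
    7: "Fall; Left wrist injury",
    8: "Feels congested; Allergies",
}


def term_matches(term, text):
    """Assumed caret behavior: '^' matches any run of characters."""
    pattern = ".*".join(re.escape(piece) for piece in term.split("^"))
    return re.fullmatch(pattern, text, flags=re.IGNORECASE) is not None


def run_or_query(query, complaints=CHIEF_COMPLAINTS):
    """Return the CC numbers matched by at least one term of an OR-only query."""
    terms = re.split(",or,", query, flags=re.IGNORECASE)
    return [
        number
        for number, text in complaints.items()
        if any(term_matches(term, text) for term in terms)
    ]


print(run_or_query("^Fall^"))            # [1, 3, 4, 5, 6, 7]
print(run_or_query("^Fall^,OR,^Fell^"))  # [1, 2, 3, 4, 5, 6, 7]
```

Because each term is tested separately and a visit is kept when any one term matches, adding ^Fell^ can only widen the result. That is why CC 2 joins the list while the false positives 4 and 5 remain.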

 

Sometimes an analyst or epidemiologist will want to code a text string to appear by itself and not in the middle of another word. The code may be a series of terms associated with “OR” statements. For example, the code ^ Fall ^ surrounds the term in spaces, ensuring the term doesn’t appear in the middle of a text string like “crestfallen.” This query would miss any visit where the queried variable has the text string “Fall” at the start or end of the field because, in these instances, it wouldn’t be preceded and followed by a space. To capture these other forms, try including the code in four ways:

^ Fall ^,OR,Fall,OR,^ Fall,OR,Fall ^

In this query, ^ Fall ^ matches the term only when it is surrounded by spaces, so it can’t appear in the middle of a word. The remaining terms, Fall,OR,^ Fall,OR,Fall ^, match the term when it is the only text in the field, when it ends the field, and when it begins the field, respectively.
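If you build this four-way pattern often, a small helper can generate it for any term. The Python sketch below is only an illustration: whole_word_query and query_matches are hypothetical names, the matching logic assumes a caret means any run of characters and that matching ignores case, and the test strings are made-up field values rather than real data.

```python
import re


def whole_word_query(term):
    """Build the four forms: middle of field, whole field, end, and start."""
    forms = [f"^ {term} ^", term, f"^ {term}", f"{term} ^"]
    return ",OR,".join(forms)


def term_matches(term, text):
    """Assumed caret behavior: '^' matches any run of characters."""
    pattern = ".*".join(re.escape(piece) for piece in term.split("^"))
    return re.fullmatch(pattern, text, flags=re.IGNORECASE) is not None


def query_matches(query, text):
    """True when at least one OR-linked term matches the field text."""
    terms = re.split(",or,", query, flags=re.IGNORECASE)
    return any(term_matches(term, text) for term in terms)


query = whole_word_query("Fall")
print(query)  # ^ Fall ^,OR,Fall,OR,^ Fall,OR,Fall ^

# Hypothetical field values used only to exercise each of the four forms.
for text in ["Had a fall at home", "Fall", "Left arm injury fall",
             "Fall from ladder", "Feels crestfallen"]:
    print(text, "->", query_matches(query, text))
```

Under these assumptions the first four strings match, one per form, and “Feels crestfallen” does not. A word such as “Falling” is also excluded, so decide whether you want that before swapping this pattern in for ^Fall^.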

Speaking of multiple forms, Part 2 of this series touched on ways to include different forms of ICD9 and ICD10 codes. If your query term for an ICD9 or ICD10 code goes beyond the decimal place, a best practice is to include the decimal and non-decimal form of the code separated by an “OR” statement. Some facilities omit the decimal from the ICD code, which can change how your query performs. By including both forms, you ensure the query picks up as many relevant visits as possible. An example is the use of ICD9 code E888.9 and ICD10 code W19.XXXA, both concerning an unspecified fall:

^E888.9^,OR,^E8889^

^W19.XXXA^,OR,^W19XXXA^
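When a definition lists many codes, typing both forms by hand invites mistakes. As a rough sketch, the Python helper below (icd_code_query is a hypothetical name, not part of ESSENCE) takes a code as you would normally write it and returns the caret-wrapped decimal and non-decimal forms joined by an “OR” statement.

```python
def icd_code_query(code):
    """Return the code with and without its decimal point, joined by OR."""
    forms = [code]
    if "." in code:
        # Some facilities omit the decimal, so add the non-decimal form too.
        forms.append(code.replace(".", ""))
    return ",OR,".join(f"^{form}^" for form in forms)


print(icd_code_query("E888.9"))    # ^E888.9^,OR,^E8889^
print(icd_code_query("W19.XXXA"))  # ^W19.XXXA^,OR,^W19XXXA^
```

A code that has no decimal, such as W19, comes back as a single term because there is no second form to add.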

“AND” statements

“AND” statements are another way to use multiple query terms to craft a syndrome definition in NSSP–ESSENCE. Unlike “OR” statements, which require that at least one criterion be met, terms linked by an “AND” statement return only visits that meet every linked criterion. “AND” statements do not specify the order in which the terms appear or their location within the queried field. “AND” operators follow the same format as “OR” operators and must be surrounded by commas. The correct query format for THIS, and THAT, and THERE follows:

THIS,AND,THAT,AND,THERE

Let’s assume the following Chief Complaints, used in Parts 1 and 2, and a desire to create a query for fall-related injuries.

  1. Fall
  2. Fell getting out of car
  3. Left arm injury; Fall
  4. Falling out with friends; Suicidal
  5. Feels crestfallen
  6. Patient brought in after falling on face
  7. Fall; Left wrist injury
  8. Feels congested; Allergies

You may reasonably assume CCs 1, 2, 3, 6, and 7 are the intended cases and 4, 5, and 8 are false positives.

“AND” Query Examples

Code: ^Fall^,AND,^Injur^
Description: This “AND” statement requires the term Fall to appear alongside the term Injur. Returns CCs 3 and 7.

Code: ^Fall^,AND,^Face^
Description: This query requires the face to be mentioned alongside the term fall. Returns CC 6 ONLY.

Code: ^Fall^,AND,^Fall^
Description: Be careful of duplicative “AND” statements. This does not guarantee the text appears twice in the field. Each term can be thought of as a separate search of the field, and the return is the overlap. Returns CCs 1, 3, 4, 5, 6, and 7.

Code: ^Fall^,AND,(,^Injur^,OR,^Face^,)
Description: This is a simple combination of “AND” and “OR” statements grouped by parentheses. Parentheses will be covered later in the series, but, as in math, they group what is evaluated together: the term “fall” must appear alongside at least one term from within the parenthetical statement. Returns CCs 3, 6, and 7.
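The four “AND” results can be reproduced with a small evaluator. The Python sketch below is an approximation for study, not ESSENCE’s implementation. It assumes a caret means any run of characters, that matching ignores case, and that “AND” is evaluated before “OR” unless parentheses say otherwise (the same reading used by the expanded equivalents later in this part). The names evaluate and run_query are hypothetical.

```python
import re

CHIEF_COMPLAINTS = {
    1: "Fall",
    2: "Fell getting out of car",
    3: "Left arm injury; Fall",
    4: "Falling out with friends; Suicidal",
    5: "Feels crestfallen",
    6: "Patient brought in after falling on face",
    7: "Fall; Left wrist injury",
    8: "Feels congested; Allergies",
}


def term_matches(term, text):
    """Assumed caret behavior: '^' matches any run of characters."""
    pattern = ".*".join(re.escape(piece) for piece in term.split("^"))
    return re.fullmatch(pattern, text, flags=re.IGNORECASE) is not None


def evaluate(query, text):
    """Evaluate a comma-delimited query with AND, OR, and parentheses."""
    tokens = query.split(",")
    pos = 0

    def peek():
        return tokens[pos].strip().lower() if pos < len(tokens) else None

    def parse_or():
        nonlocal pos
        result = parse_and()
        while peek() == "or":
            pos += 1
            right = parse_and()  # always parse, then combine
            result = result or right
        return result

    def parse_and():
        nonlocal pos
        result = parse_atom()
        while peek() == "and":
            pos += 1
            right = parse_atom()
            result = result and right
        return result

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("query ends where a term was expected")
        if peek() == "(":
            pos += 1
            result = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        term = tokens[pos]
        pos += 1
        return term_matches(term, text)

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result


def run_query(query, complaints=CHIEF_COMPLAINTS):
    return [n for n, text in complaints.items() if evaluate(query, text)]


print(run_query("^Fall^,AND,^Injur^"))                # [3, 7]
print(run_query("^Fall^,AND,^Fall^"))                 # [1, 3, 4, 5, 6, 7]
print(run_query("^Fall^,AND,(,^Injur^,OR,^Face^,)"))  # [3, 6, 7]
```

Notice that the duplicated ^Fall^,AND,^Fall^ query returns exactly what ^Fall^ returns alone. Each term is checked against the whole field independently, so repeating a term adds no new requirement.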

 

Here are several examples that show how “AND” and “OR” statements can be used in practice:

  • This example is from the CDC Heroin Overdose v4 CCDD Category. Like the fall-related ICD example shown previously, notice how this excerpt uses “OR” statements to code multiple DD forms and also includes both decimal and non-decimal terms.
    …..^[;/ ]T40.1X1A^,OR,^[;/ ]T401X1A^,OR,^[;/ ]T40.1X4A^,OR,^[;/ ]T401X4A^,OR,^[;/ ]965.01[;/]^,OR,^[;/ ]96501[;/]^,OR,^[;/ ]E850.0^,OR,^[;/ ]E8500^,or,…..
  • The next example is from the CDC Legionella v1 CCDD Category. Notice how different spellings of Legionella/Legionnaire’s Disease are included with “OR” statements to ensure relevant visits are captured.
    ^LEGIONNAI^,or,^LEGIONE^,or,^LEGIONA^,or,^PONTIAC^FEVER^,…..
  • These next two examples are taken from the CDC Firearm injury V1 and Norovirus v1 CCDD Categories, respectively. Both include parenthetical statements but can be expanded into (much less compact) equivalent codes.
    (,^hit^,or,^ricochet^,or,^graze^,),and,(,^bullet^,) is equivalent to ^hit^,AND,^bullet^,OR,^ricochet^,AND,^bullet^,OR,^graze^,AND,^bullet^
    (,^stomach^,and,(,^bug^,or,^virus^,or,^ flu ^,),) is equivalent to ^stomach^,AND,^bug^,OR,^stomach^,AND,^virus^,OR,^stomach^,AND,^ flu ^
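Expanding a parenthetical statement by hand is the same distribution you would do in algebra: each term in one group is paired with each term in the other. The Python sketch below shows the idea. expand_and is a hypothetical helper that keeps every term exactly as written, including spaces inside the carets, and distributive_law_holds simply checks the underlying Boolean identity over every true/false combination. Both expansions assume “AND” is evaluated before “OR.”

```python
from itertools import product


def expand_and(left_terms, right_terms):
    """Pair every left term with every right term and join the pairs by OR."""
    pairs = [f"{left},AND,{right}"
             for left, right in product(left_terms, right_terms)]
    return ",OR,".join(pairs)


def distributive_law_holds():
    """Check (a OR b OR c) AND d == (a AND d) OR (b AND d) OR (c AND d)."""
    return all(
        ((a or b or c) and d) == ((a and d) or (b and d) or (c and d))
        for a, b, c, d in product([False, True], repeat=4)
    )


print(expand_and(["^hit^", "^ricochet^", "^graze^"], ["^bullet^"]))
# ^hit^,AND,^bullet^,OR,^ricochet^,AND,^bullet^,OR,^graze^,AND,^bullet^

print(expand_and(["^stomach^"], ["^bug^", "^virus^", "^ flu ^"]))
# ^stomach^,AND,^bug^,OR,^stomach^,AND,^virus^,OR,^stomach^,AND,^ flu ^

print(distributive_law_holds())  # True
```

The expanded form grows quickly. Two groups of five terms each would expand to 25 “AND” pairs, which is the practical reason shared definitions lean on parentheses.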

NSSP–ESSENCE also applies “AND” and “OR” statements across different fields.

After typing or pasting your code into a free-text field, you may apply the “OR” statements by selecting fields from the box titled “Also apply the search string to” …

Chief Complaints query window screenshot

…and hitting select. “OR” statements applied correctly in the Selected Query Fields will look like this:

Screen capture of Query Fields window

“AND” statements between fields in NSSP–ESSENCE will look like this with separate criteria selected for each field:

Screenshot of Selected Query Fields window

 

Try it out!

Compare the results of the following queries in the CCDD field of ESSENCE:

^Fall^,OR,^W19.XXXA^,OR,^W19XXXA^

^Fall^,AND,^W19.XXXA^,OR,^Fall^,AND,^W19XXXA^

Try selecting other “Also apply the search string to:” options and then try pasting the query into more than one field to see how this alters results.

We thank Senior Data Analyst Zachary Stein for volunteering to write a series of articles about free-text coding. Stein, formerly with the Kansas Department of Health and Environment, does epidemiologic work to support NSSP efforts. Stein is an active participant in the NSSP CoP. He initially wrote about free-text coding as an entry on the NSSP CoP Syndrome Definition Committee forum. The forum generated considerable interest, inspiring this series. Stein acknowledges input provided by others who contributed to the forum post.