Part 3: Inclusionary Terms

Purpose

This is the third article in a series about how to write ESSENCE free-text queries. Free-text coding in ESSENCE, which is accessible to all users, follows distinct patterns. Learning to read these patterns allows users to take queries from many places and repurpose them to suit their unique needs. Syndromic surveillance depends heavily on sharing methods, and practitioners must understand the language.

Introduction

Please see Part 1: Wildcards for background information about the search criteria for ESSENCE free-text queries, which are built around Boolean logical operators and regular expressions.

Now that this series has covered basic query notation and the use of carets, underscores, and brackets, the next step is to string query criteria together in a series. Even though a syndrome definition could technically be a single term, using multiple terms lets a query capture the varied ways content appears in a field, such as different wordings in Chief Complaint (CC) text or different forms of Discharge Diagnosis (DD) codes. This part covers the basic "OR" and "AND" statements that link multiple terms and offers basic tips for using them.

“OR” statements

"OR" statements are the most basic way to use multiple query terms in ESSENCE. Many queries and data explorations start with a basic search for "SYMPTOM #1" OR "SYMPTOM #2." In ESSENCE, an "OR" statement is automatically applied to terms separated by a comma, but a best practice is to include all operators to improve troubleshooting and readability when sharing syndrome definitions.

ESSENCE queries are not case sensitive, and neither are the operators. Some ESSENCE users prefer to capitalize all operators to make them stand out, whereas other users prefer to leave operators lowercase. It's all personal preference. However, operators should always be surrounded by commas, and the correct format of a query for THIS, THAT, or THERE follows:

  • THIS,OR,THAT,OR,THERE

Note the commas on both sides of each operator.

Let's assume the following Chief Complaints (used in Parts 1 and 2) and a desire to create a query for fall-related injuries.

  1. Fall
  2. Fell getting out of car
  3. Left arm injury; Fall
  4. Falling out with friends; Suicidal
  5. Feels crestfallen
  6. Patient brought in after falling on face
  7. Fall; Left wrist injury
  8. Feels congested; Allergies

You may reasonably assume that CCs 1, 2, 3, 6, and 7 are the intended cases and that CCs 4, 5, and 8 are not fall-related and would be false positives if returned.

“OR” Query Examples

Code: ^Fall^
Description: This is a simple query of a term surrounded by carets, taken from Part 1. It returns CCs 1, 3, 4, 5, 6, and 7.

Code: Fall,OR,Fell getting out of car,OR,Left arm injury; Fall,OR,Patient brought in after falling on face,OR,Fall; Left wrist injury
Description: This extremely long query lists the CCs exactly as they appear and accomplishes our goal to include CCs 1, 2, 3, 6, and 7 while excluding CCs 4, 5, and 8. Although it works for this very small sample of CCs, it almost certainly will not work well on the broad range of real syndromic data. It’s also long and hard to read.

Code: ^Fall^,OR,^Fell^
Description: This covers both forms of “Fall” in our CCs. Returns CCs 1, 2, 3, 4, 5, 6, and 7.

Code: ^ Fall^,OR,Fall ^,OR,Fall,OR,^Fell^
Description: The first term removes the false positive “Crestfallen,” and the second term captures visits where the term starts the CC. The third term returns the stand-alone fall term, and the fourth term returns all visits containing “Fell.” Returns CCs 1, 2, 3, 6, and 7.
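The first and third queries in the table can be checked offline with a rough simulation. The following is a minimal Python sketch, not ESSENCE's actual matching engine: it assumes each caret stands for any run of characters, that matching is case insensitive, and that a term without carets must equal the whole field. The names CHIEF_COMPLAINTS, term_matches, and run_or_query are hypothetical.

```python
import re

# The eight sample Chief Complaints from this article, keyed by CC number.
CHIEF_COMPLAINTS = {
    1: "Fall",
    2: "Fell getting out of car",
    3: "Left arm injury; Fall",
    4: "Falling out with friends; Suicidal",
    5: "Feels crestfallen",
    6: "Patient brought in after falling on face",
    7: "Fall; Left wrist injury",
    8: "Feels congested; Allergies",
}


def term_matches(term, text):
    """Assumed model: each caret matches any run of characters, ignoring case."""
    pattern = ".*".join(re.escape(part) for part in term.split("^"))
    return re.fullmatch(pattern, text, flags=re.IGNORECASE) is not None


def run_or_query(query):
    """Return the CC numbers matched by at least one term of an OR-only query."""
    terms = [tok for tok in query.split(",") if tok.strip().upper() != "OR"]
    return [
        number
        for number, text in CHIEF_COMPLAINTS.items()
        if any(term_matches(term, text) for term in terms)
    ]


print(run_or_query("^Fall^"))           # [1, 3, 4, 5, 6, 7]
print(run_or_query("^Fall^,OR,^Fell^"))  # [1, 2, 3, 4, 5, 6, 7]
```

This sketch does not model the underscore or bracket wildcards from Parts 1 and 2, or any preprocessing ESSENCE applies to the field text, so results for space-sensitive terms may differ from what ESSENCE returns.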

Sometimes an analyst or epidemiologist will want to code a text string to appear by itself and not in the middle of another word. The code may be a series of terms associated with "OR" statements. For example, the code ^ Fall ^ surrounds the term in spaces, ensuring the term doesn't appear in the middle of a text string like "crestfallen." This query would miss any visit where the queried variable has the text string "Fall" at the start or end of the field because, in these instances, it wouldn't be preceded and followed by a space. To capture these other forms, try including the code in four ways:

  • ^ Fall ^,OR,Fall,OR,^ Fall,OR,Fall ^

In this query, ^ Fall ^ matches the term only when it is set off by spaces on both sides, so it cannot appear in the middle of a word. The remaining terms (Fall, then ^ Fall, then Fall ^) match the term when it is the only text in the field, at the end of the field, and at the beginning of the field, respectively.
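Because the four forms follow a fixed pattern, they can be generated for any term rather than typed by hand. This is a small Python sketch of that idea; the helper name standalone_forms is hypothetical and is not an ESSENCE feature.

```python
def standalone_forms(term):
    """Build the four-way OR query that targets a term as a stand-alone word."""
    t = term.strip()
    forms = [
        f"^ {t} ^",  # surrounded by spaces: somewhere inside the field
        t,           # the only text in the field
        f"^ {t}",    # at the end of the field
        f"{t} ^",    # at the beginning of the field
    ]
    return ",OR,".join(forms)


print(standalone_forms("Fall"))  # ^ Fall ^,OR,Fall,OR,^ Fall,OR,Fall ^
```

The output can be pasted into a free-text field as is, or joined to other terms with further "OR" statements.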

Speaking of multiple forms, Part 2 of this series touched on ways to include different forms of ICD-9 and ICD-10 codes. If your query term for an ICD-9 or ICD-10 code goes beyond the decimal place, a best practice is to include both the decimal and non-decimal forms of the code, separated by an "OR" statement. Some facilities omit the decimal from the ICD code, which can change how your query performs. By including both forms, you ensure the query picks up as many relevant visits as possible. An example is the use of ICD-9 code E888.9 and ICD-10 code W19.XXXA, both concerning an unspecified fall:

  • ^E888.9^,OR,^E8889^
  • ^W19.XXXA^,OR,^W19XXXA^
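The decimal and non-decimal pairing is mechanical, so it can be produced from the code itself. The sketch below is one way to do that in Python; icd_both_forms is a hypothetical helper name, and wrapping each form in carets simply follows the two examples above.

```python
def icd_both_forms(code):
    """Return an OR query covering the decimal and non-decimal forms of an ICD code."""
    code = code.strip()
    forms = [f"^{code}^"]
    if "." in code:
        # Some facilities omit the decimal, so add the code without it.
        undotted = code.replace(".", "")
        forms.append(f"^{undotted}^")
    return ",OR,".join(forms)


print(icd_both_forms("E888.9"))    # ^E888.9^,OR,^E8889^
print(icd_both_forms("W19.XXXA"))  # ^W19.XXXA^,OR,^W19XXXA^
```

A code with no decimal is returned as a single caret-wrapped term, since there is no second form to add.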

“AND” statements

"AND" statements are another way to use multiple query terms to craft a syndrome definition in NSSP–ESSENCE. Unlike "OR" statements, which require that at least one criterion be met, terms linked by an "AND" statement return only visits that meet every linked criterion. "AND" statements do not specify the order in which the terms appear or their location within the queried field. "AND" operators follow the same format as "OR" operators and must be surrounded by commas. The correct query format for THIS and THAT and THERE follows:

  • THIS,AND,THAT,AND,THERE

Let's assume the following Chief Complaints, used in Parts 1 and 2, and a desire to create a query for fall-related injuries.

  1. Fall
  2. Fell getting out of car
  3. Left arm injury; Fall
  4. Falling out with friends; Suicidal
  5. Feels crestfallen
  6. Patient brought in after falling on face
  7. Fall; Left wrist injury
  8. Feels congested; Allergies

You may reasonably assume that CCs 1, 2, 3, 6, and 7 are the intended cases and that CCs 4, 5, and 8 are not fall-related and would be false positives if returned.

“AND” Query Examples

Code: ^Fall^,AND,^Injur^
Description: This “AND” statement requires the term Fall to appear alongside the term Injur. Returns CCs 3 and 7.

Code: ^Fall^,AND,^Face^
Description: This query requires the face to be mentioned alongside the term fall. Returns CC 6 only.

Code: ^Fall^,AND,^Fall^
Description: Be careful of duplicative “AND” statements. This does not require the text to appear twice in the field. Each term can be thought of as a separate search of the field, and the result is the overlap. Returns CCs 1, 3, 4, 5, 6, and 7.

Code: ^Fall^,AND,(,^Injur^,OR,^Face^,)
Description: This is a simple combination of “AND” and “OR” statements grouped by parentheses. Parentheses will be covered later in the series, but as in math, the parentheses are evaluated first, so the term “fall” must appear alongside at least one term from within the parenthetical statement. Returns CCs 3, 6, and 7.
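The four "AND" examples can be reproduced with a small offline evaluator. The following Python sketch is a simplified model and not ESSENCE's own engine. It assumes that each caret matches any run of characters, that matching ignores case, that "AND" binds more tightly than "OR" (consistent with the expanded equivalents in the Examples section), and that every operator and parenthesis is a separate comma-delimited token. All function names are hypothetical.

```python
import re

CHIEF_COMPLAINTS = {
    1: "Fall",
    2: "Fell getting out of car",
    3: "Left arm injury; Fall",
    4: "Falling out with friends; Suicidal",
    5: "Feels crestfallen",
    6: "Patient brought in after falling on face",
    7: "Fall; Left wrist injury",
    8: "Feels congested; Allergies",
}


def term_matches(term, text):
    """Assumed model: each caret matches any run of characters, ignoring case."""
    pattern = ".*".join(re.escape(part) for part in term.split("^"))
    return re.fullmatch(pattern, text, flags=re.IGNORECASE) is not None


def evaluate(tokens, text):
    """Evaluate a tokenized query against one field value."""
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos].strip().upper() == "OR":
            pos += 1
            right = parse_and()  # always parse, then combine
            result = result or right
        return result

    def parse_and():
        nonlocal pos
        result = parse_atom()
        while pos < len(tokens) and tokens[pos].strip().upper() == "AND":
            pos += 1
            right = parse_atom()  # always parse, then combine
            result = result and right
        return result

    def parse_atom():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token.strip() == "(":
            result = parse_or()
            pos += 1  # step over the closing ")"
            return result
        return term_matches(token, text)

    return parse_or()


def run_query(query):
    """Return the CC numbers whose text satisfies the query."""
    tokens = [tok for tok in query.split(",") if tok != ""]
    return [n for n, text in CHIEF_COMPLAINTS.items() if evaluate(tokens, text)]


print(run_query("^Fall^,AND,^Injur^"))                # [3, 7]
print(run_query("^Fall^,AND,^Face^"))                 # [6]
print(run_query("^Fall^,AND,^Fall^"))                 # [1, 3, 4, 5, 6, 7]
print(run_query("^Fall^,AND,(,^Injur^,OR,^Face^,)"))  # [3, 6, 7]
```

The third result illustrates the point about duplicative "AND" statements: each term is tested against the field independently, so repeating a term returns the same visits as the single term.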

Examples

Here are several examples that show how "AND" and "OR" statements can be used in practice.

The first example is from the CDC Heroin Overdose v4 CCDD Category. Like the fall-related ICD example shown previously, notice how this excerpt uses "OR" statements to code multiple DD forms and also includes both decimal and non-decimal terms:

  • ......^[;/ ]T40.1X1A^,OR,^[;/ ]T401X1A^,OR,^[;/ ]T40.1X4A^,OR,^[;/ ]T401X4A^,OR,^[;/ ]965.01[;/]^,OR,^[;/ ]96501[;/]^,OR,^[;/ ]E850.0^,OR,^[;/ ]E8500^,or,.....

The next example is from the CDC Legionella v1 CCDD Category. Notice how different spellings of Legionella/Legionnaire's Disease are included with "OR" statements to ensure relevant visits are captured:

  • ^LEGIONNAI^,or,^LEGIONE^,or,^LEGIONA^,or,^PONTIAC^FEVER^,.....

These next two examples are taken from the CDC Firearm injury V1 and Norovirus v1 CCDD Categories, respectively. Both include parenthetical statements but can be expanded into (much less compact) equivalent codes:

  • (,^hit^,or,^ricochet^,or,^graze^,),and,(,^bullet^,) is equivalent to ^hit^,AND,^bullet^,OR,^ricochet^,AND,^bullet^,OR,^graze^,AND,^bullet^
  • (,^stomach^,and,(,^bug^,or,^virus^,or,^ flu ^,)) is equivalent to ^stomach^,AND,^bug^,OR,^stomach^,AND,^virus^,OR,^stomach^,AND,^ flu ^
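The expansion in both equivalences is a distribution of "AND" over "OR": take one term from each parenthetical group, join each combination with "AND", and join the combinations with "OR". A minimal Python sketch of that step follows; expand_groups is a hypothetical helper, and the expanded form relies on "AND" being evaluated before "OR", as the equivalences above imply.

```python
from itertools import product


def expand_groups(groups):
    """Expand an AND of OR-groups into a flat query with no parentheses.

    `groups` is a list of lists; terms within a list are alternatives (OR),
    and the lists themselves are all required (AND).
    """
    combinations = product(*groups)
    return ",OR,".join(",AND,".join(combo) for combo in combinations)


firearm = [["^hit^", "^ricochet^", "^graze^"], ["^bullet^"]]
print(expand_groups(firearm))
# ^hit^,AND,^bullet^,OR,^ricochet^,AND,^bullet^,OR,^graze^,AND,^bullet^

norovirus = [["^stomach^"], ["^bug^", "^virus^", "^ flu ^"]]
print(expand_groups(norovirus))
# ^stomach^,AND,^bug^,OR,^stomach^,AND,^virus^,OR,^stomach^,AND,^ flu ^
```

The number of expanded combinations is the product of the group sizes, which is why the parenthetical form stays so much more compact as groups grow.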

Instructions

NSSP–ESSENCE also applies "AND" and "OR" statements across different fields.

After typing or pasting your code into a free-text field, you may apply the "OR" statements by selecting fields from the box titled "Also apply the search string to" ...

[Figure: Chief Complaints query window. Caption: Select the appropriate fields for your "OR" statements.]

…and hitting select. “OR” statements applied correctly in the Selected Query Fields will look like this:

[Figure: Query Fields window. Caption: How "OR" statements appear.]

"AND" statements between fields in NSSP–ESSENCE will look like this, with separate criteria selected for each field:

[Figure: Selected Query Fields window. Caption: A look at "AND" statement fields in NSSP–ESSENCE.]

Try it out!

Compare the results of the following queries in the CCDD field of ESSENCE:

  • ^Fall^,OR,^W19.XXXA^,OR,^W19XXXA^
  • ^Fall^,AND,^W19.XXXA^,OR,^Fall^,AND,^W19XXXA^

Try selecting other "Also apply the search string to:" options and then try pasting the query into more than one field to see how this alters results.

We thank Senior Data Analyst Zachary Stein for volunteering to write a series of articles about free-text coding. Stein does epidemiological work to support NSSP efforts and is an active participant in our NSSP Community of Practice (CoP). He initially wrote about free-text coding as an entry on the NSSP-CoP Syndrome Definition Committee forum. The forum generated considerable interest, inspiring this series. Stein acknowledges input provided by others who contributed to the forum post.