Part 2: Underscores _ and Brackets [ ]

Purpose

This is the second article in a series about how to write ESSENCE free-text queries. Free-text coding in ESSENCE, which is accessible to all users, follows distinct patterns. Learning to read these patterns allows users to take queries from many places and repurpose them to suit their unique needs. Syndromic surveillance depends heavily on sharing methods, and practitioners must understand the language.


Introduction

Please see Part 1: Wildcards for background information about the search criteria for ESSENCE free-text queries, which are built around Boolean logical operators and regular expressions.

Underscores and brackets are alike in that each represents a single position in a string of numbers, text, or symbols. An underscore is a placeholder that specifies that something, anything at all, must be in a particular position. A bracket allows the user to specify what options can appear in that position.

How underscores work

An underscore is a placeholder indicating that something must be present in the position.

Let's assume the following Chief Complaints (CC), previously used in Part 1: Wildcards, and a desire to create a query for fall-related injuries.

  1. Fall
  2. Fell getting out of car
  3. Left arm injury; Fall
  4. Falling out with friends; Suicidal
  5. Feels crestfallen
  6. Patient brought in after falling on face
  7. Fall; Left wrist injury
  8. Feels congested; Allergies

You may reasonably assume that CCs 1, 2, 3, 6, and 7 are the intended cases and that CCs 4, 5, and 8 would be false positives if a query returned them.

Here's a table that shows how the use of the underscore can affect results:

Code | Description
^Fall_^ | An underscore at the end of a text string means something must follow the text. Returns CCs 4, 5, 6, and 7. Notice the underscore takes the place of an “e” in CC 5 and a “;” in CC 7.
^_Fall^ | An underscore at the beginning means something must precede the text. Returns CCs 3, 5, and 6. Notice the underscore takes the place of a space in CCs 3 and 6.
^_Fall_^ | This specifies that something must both precede and follow the text. Returns CCs 5 and 6.
^F_ll^ | In a common use of underscores, this one sits in the middle of a text string and specifies that something must be in that spot. Returns CCs 1, 2, 3, 4, 5, 6, and 7.
^F__l^ | Underscores can be used in series. Two in a row means something must be in both spots. Returns CCs 1 through 8, because “Feel” in CCs 5 and 8 fits the pattern just as “Fall” and “Fell” do. Moving outside of our example CCs, this code would also return text strings such as Feel and Fail, and potentially many others.
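The single-underscore queries above can be checked offline with a rough Python emulation. This is only a sketch, not ESSENCE's own matching engine. It assumes, following this series, that ^ stands for any run of characters (including none), that _ stands for exactly one character, and that matching ignores case. The names to_regex, returned_ccs, and CCS are made up for illustration.

```python
import re

# The eight example chief complaints, keyed by CC number.
CCS = {
    1: "Fall",
    2: "Fell getting out of car",
    3: "Left arm injury; Fall",
    4: "Falling out with friends; Suicidal",
    5: "Feels crestfallen",
    6: "Patient brought in after falling on face",
    7: "Fall; Left wrist injury",
    8: "Feels congested; Allergies",
}

def to_regex(term):
    """Translate an ESSENCE-style term into a Python regex (assumed semantics)."""
    pieces = []
    for ch in term:
        if ch == "^":
            pieces.append(".*")           # wildcard: any run of characters
        elif ch == "_":
            pieces.append(".")            # underscore: exactly one character
        else:
            pieces.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(pieces), re.IGNORECASE | re.DOTALL)

def returned_ccs(term):
    """List the CC numbers whose full text matches the term."""
    pattern = to_regex(term)
    return [n for n, text in CCS.items() if pattern.fullmatch(text)]

for term in ["^Fall_^", "^_Fall^", "^_Fall_^", "^F_ll^"]:
    print(term, returned_ccs(term))
```

Under these assumptions the script lists CCs 4, 5, 6, 7 for ^Fall_^, CCs 3, 5, 6 for ^_Fall^, CCs 5, 6 for ^_Fall_^, and CCs 1 through 7 for ^F_ll^.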

Underscore considerations

  • To query the underscore symbol, place it in brackets: ^[_]^
  • Underscores are useful in ICD-10-CM queries when the query coder cares about what's in the later characters of the code but does not need to specify what comes before them.

In the following example, the two underscores in a row skip over the fourth and fifth characters so the query can require a "2" in the sixth place. In this definition, that "2" indicates intentional self-harm and is the important part of the diagnosis code:

  • ^T39.__2^

It is common practice to include decimal and non-decimal forms of ICD-10 codes in syndrome definitions and free-text code. Use caution when doing this.

If we delete the decimal in the ^T39.__2^ example to create a non-decimal query ^T39__2^, we would pick up the intended non-decimal codes like T39012A and T392X2A. But the query would also capture unintended decimal-form codes like T39.121A or T39.829A, because the decimal point now fills one of the underscore positions. Luckily, this isn't relevant for our "intentional self-harm" example because there are no valid T39 ICD-10 codes with a 2 in the fifth place, but this isn't true for all codes and could result in the inclusion of unintended visits in your query.
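The decimal pitfall can be reproduced with a small Python sketch. It is an emulation under assumptions rather than ESSENCE itself: ^ is treated as any run of characters, _ as exactly one character, the period as a literal period, and matching ignores case. The code strings are the ones named in the text, plus the decimal forms of the two intended codes. The names to_regex, captured, and CODES are hypothetical.

```python
import re

def to_regex(term):
    """Translate an ESSENCE-style term into a Python regex (assumed semantics)."""
    pieces = []
    for ch in term:
        if ch == "^":
            pieces.append(".*")           # wildcard: any run of characters
        elif ch == "_":
            pieces.append(".")            # underscore: exactly one character
        else:
            pieces.append(re.escape(ch))  # literal, including the decimal point
    return re.compile("".join(pieces), re.IGNORECASE | re.DOTALL)

# Intended codes (non-decimal and decimal forms), then the two unintended ones.
CODES = ["T39012A", "T392X2A", "T39.012A", "T39.2X2A", "T39.121A", "T39.829A"]

def captured(term):
    """Return the codes that the term would capture."""
    pattern = to_regex(term)
    return [code for code in CODES if pattern.fullmatch(code)]

print("^T39.__2^", captured("^T39.__2^"))
print("^T39__2^ ", captured("^T39__2^"))
```

The decimal query captures only T39.012A and T39.2X2A. The non-decimal query captures T39012A and T392X2A as intended, but also T39.121A and T39.829A, where the decimal point occupies the first underscore.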

How brackets work

Brackets are placeholders that specify items that can be present in a position.

Here's a table that shows how the use of brackets can affect results:

Code | Description
^[ ;t]Fall^ | The brackets near the start mean the text “Fall” must be preceded by a space, a semicolon, or a “t”. One and only one of the three bracketed items must precede the term. Returns CCs 3, 5, and 6.
^Fall[ ;i]^ | The brackets near the end mean that one of the bracketed items must follow the term. Returns CCs 4, 6, and 7. Notice that the “i” in “Falling” lets CC 4, a false positive, through.
^[ ;t]Fall[ ;i]^ | The brackets before and after the term indicate that one of the bracketed items must precede the term and one must follow it. Returns CC 6.
^F[ae]ll^ | This bracket falls in the middle of a text string. It indicates that an “a” or an “e” must take that place. Returns CCs 1, 2, 3, 4, 5, 6, and 7.
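The bracket queries can be checked the same way with a rough Python emulation. This is a sketch under assumptions, not ESSENCE's engine: ^ is treated as any run of characters, _ as one character, a bracketed list as one character drawn from the list, and matching ignores case. The names to_regex, returned_ccs, and CCS are made up for illustration.

```python
import re

CCS = {
    1: "Fall",
    2: "Fell getting out of car",
    3: "Left arm injury; Fall",
    4: "Falling out with friends; Suicidal",
    5: "Feels crestfallen",
    6: "Patient brought in after falling on face",
    7: "Fall; Left wrist injury",
    8: "Feels congested; Allergies",
}

def to_regex(term):
    """Translate an ESSENCE-style term into a Python regex (assumed semantics)."""
    pieces = []
    # Tokens are either a whole bracketed list or a single character.
    for tok in re.findall(r"\[[^\]]*\]|.", term):
        if tok == "^":
            pieces.append(".*")            # wildcard: any run of characters
        elif tok == "_":
            pieces.append(".")             # underscore: exactly one character
        elif len(tok) > 1:
            pieces.append(tok)             # bracketed list, passed through as-is
        else:
            pieces.append(re.escape(tok))  # literal character
    return re.compile("".join(pieces), re.IGNORECASE | re.DOTALL)

def returned_ccs(term):
    """List the CC numbers whose full text matches the term."""
    pattern = to_regex(term)
    return [n for n, text in CCS.items() if pattern.fullmatch(text)]

for term in ["^[ ;t]Fall^", "^Fall[ ;i]^", "^[ ;t]Fall[ ;i]^", "^F[ae]ll^"]:
    print(term, returned_ccs(term))
```

Under these assumptions, ^Fall[ ;i]^ lists CC 4 along with CCs 6 and 7, since the "i" in "Falling" is one of the bracketed options. Requiring a bracketed character on both sides narrows the result to CC 6.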

Bracket considerations

Brackets can be troublesome. If your query returns null results, your brackets might be the source of the issue.

Brackets can contain ranges like [a-f] or [4-8]. Each line below shows equivalent bracket statements:

  • [0-2] = [012]
  • [0123456789] = [0-9] = [0-23-9]
  • [a-c] = [abc]
  • [0-3a-d] = [0123abcd]

Bracketed letters are not case sensitive.

Brackets can only take the place of one position:

  • You cannot bracket a number greater than 1 digit.
  • [0-24] does NOT give the numbers zero through twenty-four; instead, it is read as "zero through two, or four," which is equivalent to the bracket [0124].

Ranges cannot be backwards or looped. Examples of invalid brackets: [9-2] and [x-a].
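Python's regular-expression character classes happen to follow the same range conventions described here, so they offer a quick way to see what a bracket accepts. This is an analogy only: ESSENCE does not run Python, and the helper names accepted and is_valid are hypothetical.

```python
import re
import string

# Every single character we will try against a bracket.
CANDIDATES = string.digits + string.ascii_lowercase

def accepted(bracket):
    """Return the candidate characters a bracket accepts, ignoring case."""
    pattern = re.compile(bracket, re.IGNORECASE)
    return "".join(c for c in CANDIDATES if pattern.fullmatch(c))

def is_valid(bracket):
    """Report whether Python accepts the bracket as a character class."""
    try:
        re.compile(bracket)
        return True
    except re.error:
        return False

print(accepted("[0-2]"), accepted("[012]"))  # same three digits
print(accepted("[0-23-9]"))                  # two ranges side by side
print(accepted("[0-3a-d]"))                  # digits and letters mixed
print(accepted("[0-24]"))                    # 0 through 2, or 4
print(accepted("[A-C]"))                     # case is ignored
print(is_valid("[9-2]"), is_valid("[x-a]"))  # backwards ranges are rejected
```

Here [0-24] accepts only the four characters 0, 1, 2, and 4, and both backwards ranges are rejected as invalid.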

Querying of certain punctuation will only work within brackets. Examples:

  • [COMMA] to search for commas
  • [_] to search for underscores

Examples

Brackets are commonly used in discharge diagnosis (DD) queries. DD query terms frequently begin with a bracketed list of punctuation to ensure that the match starts at the beginning of a DD code and that the queried text won't be picked up from the middle of another code. Underscores are also useful in DD queries. Note the use of both brackets and underscores in the following excerpt from the CDC Firearm Injury v1 CCDD Category:

  • .....(,^[;/ ]W3[23]^,andnot,(,^[;/ ]W3[23]___[DS]^,or,^[;/ ]W3[23].___[DS]^,),),.....

This excerpt uses underscores in its negations. A "D" or an "S" in the seventh place of this ICD-10 code indicates a subsequent or sequela visit, not an acute firearm injury ED visit. By using a series of underscores, the writers could specify what they didn't want in the seventh place without specifying the middle part of the DD code that came before it.
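The include-then-exclude logic of this excerpt can be sketched in Python. This is an emulation under assumptions, not the CDC definition or ESSENCE's engine: ^ is treated as any run of characters, _ as one character, brackets as one character from the list, and matching ignores case. The sample DD strings are illustrative, not real visit data, and the names to_regex, keep_visit, and SAMPLES are hypothetical.

```python
import re

def to_regex(term):
    """Translate an ESSENCE-style term into a Python regex (assumed semantics)."""
    pieces = []
    for tok in re.findall(r"\[[^\]]*\]|.", term):
        if tok == "^":
            pieces.append(".*")            # wildcard: any run of characters
        elif tok == "_":
            pieces.append(".")             # underscore: exactly one character
        elif len(tok) > 1:
            pieces.append(tok)             # bracketed list, passed through as-is
        else:
            pieces.append(re.escape(tok))  # literal character
    return re.compile("".join(pieces), re.IGNORECASE | re.DOTALL)

INCLUDE = "^[;/ ]W3[23]^"
EXCLUDE = ["^[;/ ]W3[23]___[DS]^", "^[;/ ]W3[23].___[DS]^"]

def keep_visit(dd):
    """True when the include term matches and neither negation matches."""
    included = to_regex(INCLUDE).fullmatch(dd) is not None
    excluded = any(to_regex(term).fullmatch(dd) for term in EXCLUDE)
    return included and not excluded

# Illustrative DD field values (made up for this sketch).
SAMPLES = [";W32.0XXA;", ";W32.0XXD;", ";W3300XS;", " W33.00XA",
           "W32.0XXA", ";S01.90XA;"]

for dd in SAMPLES:
    print(repr(dd), keep_visit(dd))
```

In this sketch the two samples ending in A with a leading delimiter are kept. The D and S samples are dropped by the negations, and the sample with no leading semicolon, slash, or space is never matched in the first place.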

The bracketed list at the start of each term in this example specifies that the ICD-10 code must be preceded by a semicolon, forward slash, or space. Note the bracketed [23]: this will capture multiple forms with a single statement. For example, the following ESSENCE free-text codes are equivalent, with the second being more concise:

  • ^;W32^,or,^/W32^,or,^ W32^,or,^;W33^,or,^/W33^,or,^ W33^ = ^[;/ ]W3[23]^

Try it out!

Run these two queries in a Discharge Diagnosis field. Notice how the results are the same:

  • ^;W32^,or,^/W32^,or,^ W32^,or,^;W33^,or,^/W33^,or,^ W33^
  • ^[;/ ]W3[23]^
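If you want to check the equivalence away from ESSENCE, a rough Python emulation can compare the two forms. It is a sketch under assumptions: ^ is treated as any run of characters, brackets as one character from the list, and matching ignores case. Splitting on ",or," is a simplification that ignores parentheses and the other operators. The sample strings are illustrative, and the names to_regex, matches, and SAMPLES are hypothetical.

```python
import re

def to_regex(term):
    """Translate an ESSENCE-style term into a Python regex (assumed semantics)."""
    pieces = []
    for tok in re.findall(r"\[[^\]]*\]|.", term):
        if tok == "^":
            pieces.append(".*")            # wildcard: any run of characters
        elif tok == "_":
            pieces.append(".")             # underscore: exactly one character
        elif len(tok) > 1:
            pieces.append(tok)             # bracketed list, passed through as-is
        else:
            pieces.append(re.escape(tok))  # literal character
    return re.compile("".join(pieces), re.IGNORECASE | re.DOTALL)

LONG_FORM = "^;W32^,or,^/W32^,or,^ W32^,or,^;W33^,or,^/W33^,or,^ W33^"
SHORT_FORM = "^[;/ ]W3[23]^"

def matches(query, text):
    """True when any ,or,-separated term of the query matches the text."""
    return any(to_regex(term).fullmatch(text) for term in query.split(",or,"))

# Illustrative DD field values (made up for this sketch).
SAMPLES = [";W32.0XXA;", "/W33.00XA", " W3300XA", "W32.0XXA",
           ";W34.00XA;", ";S01.90XA;W33.01XA;"]

for dd in SAMPLES:
    print(repr(dd), matches(LONG_FORM, dd), matches(SHORT_FORM, dd))
```

Both columns agree for every sample, which is what the equivalence predicts.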

Take a simple query and experiment by replacing letters with an underscore or a bracketed list. You might find unexpected results when comparing ^Tick^ versus ^T_ck^ versus ^T[aeiou]ck^ or when comparing ^Heroin^ versus ^Her__n^ versus ^her[io][oi]n^.
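As one illustration of the kind of surprise to look for, the heroin comparison can be emulated in Python. This is a sketch under assumptions, not ESSENCE output: ^ is treated as any run of characters, _ as one character, brackets as one character from the list, and matching ignores case. The sample chief complaints are invented for the example, and the names to_regex, hits, and SAMPLES are hypothetical.

```python
import re

def to_regex(term):
    """Translate an ESSENCE-style term into a Python regex (assumed semantics)."""
    pieces = []
    for tok in re.findall(r"\[[^\]]*\]|.", term):
        if tok == "^":
            pieces.append(".*")            # wildcard: any run of characters
        elif tok == "_":
            pieces.append(".")             # underscore: exactly one character
        elif len(tok) > 1:
            pieces.append(tok)             # bracketed list, passed through as-is
        else:
            pieces.append(re.escape(tok))  # literal character
    return re.compile("".join(pieces), re.IGNORECASE | re.DOTALL)

# Invented chief complaints for illustration only.
SAMPLES = [
    "Heroin overdose",
    "Possible herion OD",          # common misspelling
    "Mother in law reports fall",  # contains "her in"
    "Hernia repair follow up",
]

def hits(term):
    """Return the sample texts that the term matches."""
    pattern = to_regex(term)
    return [text for text in SAMPLES if pattern.fullmatch(text)]

for term in ["^Heroin^", "^Her__n^", "^her[io][oi]n^"]:
    print(term, hits(term))
```

In this sketch ^Heroin^ misses the misspelling, ^Her__n^ catches it but also matches "Mother in law" because a space can fill an underscore, and ^her[io][oi]n^ catches both spellings without the stray match.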

We thank Senior Data Analyst Zachary Stein for volunteering to write a series of articles about free-text coding. Stein does epidemiological work to support NSSP efforts and is an active participant in our NSSP Community of Practice (CoP). He initially wrote about free-text coding as an entry on the NSSP-CoP Syndrome Definition Committee forum. The forum generated considerable interest, inspiring this series. Stein acknowledges input provided by others who contributed to the forum post.