Free-text Coding in NSSP–ESSENCE: Part 2
This is the second article in the series about how to write ESSENCE free-text queries. We thank Senior Data Analyst Zachary Stein for developing this series.
The search criteria for ESSENCE free-text queries are built around Boolean logical operators and regular expressions. Free-text queries are not case-sensitive and may contain “^” for wildcards; “,” for multiple entries; “ISBLANK” to look for blanks; “ISNULL” to look for nulls; “[COMMA]” to look for commas; the operators “and,” “or,” and “andnot”; and parentheses “()” to define order and grouping. This series will cover all these topics in depth.
Free-text queries are what make syndromic surveillance practice, particularly practice using NSSP–ESSENCE, adaptable to different data sources and types. By using free-text queries, analysts and epidemiologists can exercise a high level of customization. They can quickly code free-text queries and rapidly respond to outbreaks, disasters, and other unfolding events. Such capabilities let users tailor queries to the level of detail in their data, which supports more accurate results.
Free-text coding in ESSENCE, which is accessible to all users, follows distinct patterns. Learning to read these patterns allows users to take queries from many places and repurpose them to suit their unique needs. Syndromic surveillance depends heavily on sharing methods, and practitioners must understand the language.
Part 2. Underscores _ and Brackets [ ]
Underscores and Brackets are alike in that each represents a single position in a string of numbers, text, or symbols. An Underscore is a placeholder that specifies that something, anything at all, must be in a particular position. A Bracket allows the user to specify what options can appear in that position.
_ Underscore
A placeholder indicating that something must be present in the position.
Let’s assume the following Chief Complaints (CC), previously used in Part 1 Wildcards, and a desire to create a query for fall-related injuries.
- Fell getting out of car
- Left arm injury; Fall
- Falling out with friends; Suicidal
- Feels crestfallen
- Patient brought in after falling on face
- Fall; Left wrist injury
- Feels congested; Allergies
You may reasonably assume CCs 1, 2, 3, 6, and 7 are the intended cases and 4, 5, and 8 are false positives.
Here’s a table that shows how the use of the Underscore can affect results:
_ Underscore Query Examples
| Query | Explanation and results |
|---|---|
| ^Fall_^ | An Underscore at the end of a text string means something must follow the text. Returns CCs 4, 5, 6, and 7. Notice the Underscore takes the place of an “e” in CC 5 and a “;” in CC 7. |
| ^_Fall^ | This Underscore at the beginning specifies that something must be before the text. Returns CCs 3, 5, and 6. Notice the Underscore takes the place of a space in CCs 3 and 6. |
| ^_Fall_^ | This specifies something must precede and follow the text. Returns CCs 5 and 6. |
| ^F_ll^ | A common use of Underscores, this Underscore is in the middle of a text string and specifies that something must be in that spot. Returns CCs 1, 2, 3, 4, 5, 6, and 7. |
| ^F__l^ | Underscores can be used in series. Two in a row means something must be in both spots. Returns CCs 1 through 8; CC 8 is picked up because “Feel” in “Feels” fits the pattern. Moving outside of our example CCs, this code would also return the text strings Feel, Left lower, Fail, and potentially many others. |
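The Underscore behavior described in the table can be approximated outside ESSENCE for quick experiments. The following is a minimal Python sketch, not ESSENCE's actual matching engine: it assumes “^” means zero or more characters, “_” means exactly one character, bracket groups pass through as character classes, and matching is case-insensitive. The function names are hypothetical, and the list holds only the example Chief Complaints shown above.

```python
import re

# Tokens: a bracket group, the ^ wildcard, the _ placeholder, or any other single character.
TOKEN = re.compile(r"\[[^\]]+\]|\^|_|.", re.DOTALL)

def essence_term_to_regex(term):
    """Translate one ESSENCE-style term into a compiled regex (illustrative approximation)."""
    pieces = []
    for tok in TOKEN.findall(term):
        if tok == "^":
            pieces.append(".*")            # wildcard: zero or more characters
        elif tok == "_":
            pieces.append(".")             # Underscore: exactly one character
        elif len(tok) > 1:
            pieces.append(tok)             # bracket group such as [_] passes through unchanged
        else:
            pieces.append(re.escape(tok))  # any other character is treated literally
    return re.compile("".join(pieces), re.IGNORECASE | re.DOTALL)

def matches(term, text):
    """True if the whole text satisfies the term."""
    return essence_term_to_regex(term).fullmatch(text) is not None

chief_complaints = [
    "Fell getting out of car",
    "Left arm injury; Fall",
    "Falling out with friends; Suicidal",
    "Feels crestfallen",
    "Patient brought in after falling on face",
    "Fall; Left wrist injury",
    "Feels congested; Allergies",
]

for term in ["^Fall_^", "^_Fall^", "^_Fall_^", "^F_ll^"]:
    hits = [cc for cc in chief_complaints if matches(term, cc)]
    print(term, "->", hits)
```

Because the translation is only an approximation, results from a sketch like this should be confirmed by running the query in ESSENCE itself.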
Notes on How to Use Underscores in Queries
- To query the Underscore symbol, place it in Brackets: ^[_]^
- Underscores are useful in ICD10-CM queries when the query coder cares about what’s in later digits of the code but does not need to specify what comes before it.
In the following example, the two Underscores in a row stand in for the fourth and fifth characters so that the query can require a “2” in the sixth place, which in this definition indicates intentional self-harm and is the important part of the diagnosis code: ^T39.__2^
It is common practice to include decimal and non-decimal forms of ICD10 codes in syndrome definitions and free-text code. Use caution when doing this.
If we delete the decimal in the ^T39.__2^ example to create a non-decimal query ^T39__2^, we would pick up the intended non-decimal form ICD10 codes like T39012A and T392X2A. However, the query would also capture unintended decimal-form codes like T39.121A or T39.829A, because the first Underscore can match the decimal point, which leaves the “2” in the fifth place rather than the sixth. Luckily, this isn’t relevant for our “intentional self-harm” example because there are no valid T39 ICD10 codes with a 2 in the fifth place, but this isn’t true for all codes and could result in the inclusion of unintended visits in your query.
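The decimal versus non-decimal caution can be checked with a short Python sketch. It assumes that a regular-expression “.” is a fair stand-in for an ESSENCE Underscore and that the surrounding “^” wildcards amount to a "contains" search; the helper name is hypothetical, and the code strings are the ones quoted above.

```python
import re

def contains(pattern, code):
    """True if the pattern occurs anywhere in the code (case-insensitive)."""
    return re.search(pattern, code, re.IGNORECASE) is not None

# Assumed regex analogues of the two ESSENCE terms.
decimal_query = r"T39\...2"     # ^T39.__2^ : literal period, two placeholders, then 2
non_decimal_query = r"T39..2"   # ^T39__2^  : two placeholders, then 2

codes = ["T39.012A", "T39012A", "T392X2A", "T39.121A", "T39.829A"]

for code in codes:
    print(code,
          "decimal query:", contains(decimal_query, code),
          "non-decimal query:", contains(non_decimal_query, code))
```

In this sketch the non-decimal pattern accepts T39.121A and T39.829A because its first placeholder absorbs the decimal point, which is the overlap the paragraph above warns about.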
[ ] Brackets
Placeholders that specify items that can be present in a position.
Here’s a table that shows how the use of Brackets can affect results:
[ ] Bracket Query Examples
| Query | Explanation and results |
|---|---|
| ^[ ;t]Fall^ | The Brackets near the start mean the text “Fall” must be preceded by a space, semicolon, or a “t”. One and only one of the three bracketed items must precede the term. Returns CCs 3, 5, and 6. |
| ^Fall[ ;i]^ | The Brackets near the end mean that one of the bracketed items must follow the term. Returns CCs 4, 6, and 7. |
| ^[ ;t]Fall[ ;i]^ | The Brackets before and after the term indicate that one of the bracketed items must precede and follow the term. Returns CC 6. |
| ^F[ae]ll^ | This Bracket falls in the middle of a text string. It indicates that an “a” or an “e” must take that place. Returns CCs 1, 2, 3, 4, 5, 6, and 7. |
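Brackets behave much like character classes in ordinary regular expressions, so the table can be explored with a small Python sketch. This is an approximation under stated assumptions: each ESSENCE term is paired by hand with a regex that is assumed to be equivalent, the “^” wildcards are treated as a "contains" search, and the `hits` helper is hypothetical. The list holds only the example Chief Complaints shown earlier.

```python
import re

chief_complaints = [
    "Fell getting out of car",
    "Left arm injury; Fall",
    "Falling out with friends; Suicidal",
    "Feels crestfallen",
    "Patient brought in after falling on face",
    "Fall; Left wrist injury",
    "Feels congested; Allergies",
]

# ESSENCE term -> assumed regex analogue (each bracket matches exactly one character).
bracket_queries = {
    "^[ ;t]Fall^": r"[ ;t]fall",
    "^Fall[ ;i]^": r"fall[ ;i]",
    "^[ ;t]Fall[ ;i]^": r"[ ;t]fall[ ;i]",
    "^F[ae]ll^": r"f[ae]ll",
}

def hits(pattern):
    """Return the chief complaints that contain the pattern, ignoring case."""
    return [cc for cc in chief_complaints if re.search(pattern, cc, re.IGNORECASE)]

for term, pattern in bracket_queries.items():
    print(term, "->", hits(pattern))
```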
Notes on Brackets
- Brackets can be troublesome. If your query returns null results, your Brackets might be the source of the issue.
- Brackets can contain ranges like [a-f] or [4-8]. Following are all equivalent Bracket statements:
  - [0-2] = [012]
  - [0123456789] = [0-9] = [0-23-9]
  - [a-c] = [abc]
  - [0-3a-d] = [0123abcd]
- Bracketed letters are NOT case sensitive.
- Brackets can only take the place of ONE position:
  - You cannot bracket a number greater than 1 digit.
  - [0-24] does NOT give the numbers zero through twenty-four; instead, it will be interpreted as zero through two or four, the equivalent of the Bracket [0124].
- Ranges cannot be backwards or looped. Examples of invalid Brackets: [9-2] and [x-a].
- Querying of certain punctuation will only work within Brackets. Examples follow:
  - [COMMA] to search for commas
  - [_] to search for underscores
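The range rules in these notes mirror how character classes work in Python's `re` module, which makes them easy to verify. The sketch below is an analogy rather than a test of ESSENCE itself, and the two helper functions are hypothetical names introduced for illustration.

```python
import re
import string

def chars_matched(bracket):
    """Return the single characters (digits and letters) that a bracket expression accepts."""
    rx = re.compile(bracket, re.IGNORECASE)
    candidates = string.digits + string.ascii_lowercase
    return {c for c in candidates if rx.fullmatch(c)}

def is_valid(bracket):
    """True if Python can compile the bracket expression."""
    try:
        re.compile(bracket)
        return True
    except re.error:
        return False

# [0-24] is the range 0-2 plus the single digit 4, not zero through twenty-four.
print(sorted(chars_matched("[0-24]")))

# Equivalent ways of writing the same set of characters.
print(chars_matched("[0-9]") == chars_matched("[0-23-9]"))
print(chars_matched("[0-3a-d]") == set("0123abcd"))

# A bracket fills exactly one position, so a two-digit string never matches it.
print(re.fullmatch("[0-24]", "24") is None)

# Backwards ranges are rejected.
print(is_valid("[9-2]"), is_valid("[x-a]"))
```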
Examples that Use Underscore and Bracket
Brackets are commonly used in discharge diagnosis (DD) queries. DD queries frequently begin with a bracketed list of punctuation to ensure you match the start of a DD code and that your queried text won’t be matched in the middle of another code. Underscores are also useful in DD queries. Note the use of both the Brackets and Underscores in the following excerpt from the CDC Firearm Injury v1 CCDD Category:
…..(,^[;/ ]W3[23]^,andnot,(,^[;/ ]W3[23]___[DS]^,or,^[;/ ]W3[23].___[DS]^,),),…..
This excerpt uses Underscores in its negations. A “D” or an “S” in the seventh place of this ICD10 code indicates that this is a subsequent or sequela visit and is not an acute firearm injury ED visit. By using a series of Underscores, the writers could specify what they didn’t want in the seventh place without specifying the middle part of the DD code that came before it.
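The include-then-negate logic of an excerpt like this can be sketched in Python. Several assumptions apply: the third character of the code is taken to be bracketed as [23], regex “.” stands in for each Underscore, the sample discharge diagnosis strings are hypothetical, and `is_acute` is an illustrative helper rather than anything ESSENCE provides.

```python
import re

# Assumed analogue of the inclusion term: delimiter, then W32 or W33.
INCLUDE = re.compile(r"[;/ ]W3[23]", re.IGNORECASE)

# Assumed analogues of the two negated terms (non-decimal and decimal forms):
# three placeholders after the third character, then D or S in the seventh place.
EXCLUDE = [
    re.compile(r"[;/ ]W3[23]...[DS]", re.IGNORECASE),
    re.compile(r"[;/ ]W3[23]\....[DS]", re.IGNORECASE),
]

def is_acute(dd_field):
    """True if the field has a W32/W33 code that is not a subsequent or sequela encounter."""
    if not INCLUDE.search(dd_field):
        return False
    return not any(rx.search(dd_field) for rx in EXCLUDE)

# Hypothetical discharge diagnosis strings for illustration only.
samples = [";W32.0XXA;", "/W320XXA", ";W3300XD;", ";W33.00XS;", ";S01.01XA;"]

for dd in samples:
    print(repr(dd), "->", is_acute(dd))
```

The placeholders let the negation test the seventh character without naming the fourth through sixth characters, which is the design choice the paragraph above describes.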
The ICD10 codes in this example must be preceded by a semicolon, forward slash, or space. Note the bracketed [23]: this captures multiple forms with a single statement. For example, the following ESSENCE free-text codes are equivalent, with the second being more concise:
^;W32^,or,^/W32^,or,^ W32^,or,^;W33^,or,^/W33^,or,^ W33^ = ^[;/ ]W3[23]^
Try it out!
Run these two queries in a Discharge Diagnosis field. Notice how the results are the same:
^;W32^,or,^/W32^,or,^ W32^,or,^;W33^,or,^/W33^,or,^ W33^
^[;/ ]W3[23]^
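Before running the comparison in ESSENCE, the claimed equivalence can be sanity-checked offline. The Python sketch below assumes the concise form brackets both the leading delimiter and the final digit, as in [;/ ] followed by W3 and [23]; the sample strings and helper names are hypothetical.

```python
import re
from itertools import product

# The six literal terms of the long "or" chain.
LONG_TERMS = [";W32", "/W32", " W32", ";W33", "/W33", " W33"]

# Assumed regex analogue of the concise bracketed term.
CONCISE = re.compile(r"[;/ ]W3[23]", re.IGNORECASE)

def long_form(text):
    """True if any of the six literal terms occurs in the text (case-insensitive)."""
    upper = text.upper()
    return any(term in upper for term in LONG_TERMS)

def concise_form(text):
    """True if the bracketed pattern occurs in the text."""
    return CONCISE.search(text) is not None

# Try every combination of a few leading characters and third digits.
samples = [lead + "W3" + digit + ".0XXA"
           for lead, digit in product([";", "/", " ", ",", "X", ""], "0123456789")]

disagreements = [s for s in samples if long_form(s) != concise_form(s)]
print("samples checked:", len(samples))
print("disagreements:", disagreements)
```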
Take a simple query and experiment by replacing letters with an Underscore or a bracketed list. You might find unexpected results when comparing ^Tick^ versus ^T_ck^ versus ^T[aeiou]ck^ or when comparing ^Heroin^ versus ^Her__n^ versus ^her[io][oi]n^
We thank Senior Data Analyst Zachary Stein for volunteering to write a series of articles about free-text coding. Stein, formerly with the Kansas Department of Health and Environment, does epidemiologic work to support NSSP efforts. Stein is an active participant in the NSSP CoP. He initially wrote about free-text coding as an entry on the NSSP CoP Syndrome Definition Committee forum. The forum post generated considerable interest, inspiring this series. Stein acknowledges input provided by others who contributed to the forum post.