Developing hospital identifiers in Medicaid TAF data
Part 2: Using the NPI CCN Crosswalk developed on Medicare Advantage claims
The Centers for Medicare and Medicaid Services (CMS) has released a national dataset of de-identified Medicaid claims available to researchers: the Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files (TAF). This dataset brings together claims from various states into a single resource, making it easier for analysts to study healthcare trends and outcomes across the country. While the dataset is promising, it has data quality issues that analysts must navigate.
We at the Lown Institute are aiming to develop claims-based measures using TAF. Since there are few resources on working with this data, we’re excited to share our findings and developed approaches through this blog series. Please comment below if you have any ideas, questions, or if you are working on something similar and found this helpful!
In our previous post, we analyzed whether we could create a CMS Certification Number (CCN) to submitting state ID crosswalk using the TAF Annual Provider (APR) file. Here, we wanted to see whether we could reliably use another approach, one we have used in our other work on Medicare Advantage (MA) encounter data and Fee-For-Service (FFS) claims. Because the MA encounter data uses National Provider Identifiers (NPIs) rather than CCNs, we developed an NPI to CCN crosswalk based on the information in the FFS claims. The CMS Office of Enterprise Data and Analytics released a report last year detailing their very similar approach.
CCN to NPI using Medicare FFS claims
We selected the prvdr_num (CCN) and org_npi_num (organizational NPI) fields from 2020 FFS inpatient and outpatient claims. This does mean we are limited to NPIs and CCNs that filed claims to traditional Medicare. Our universe of possible CCNs (as described in the previous post) is the 2020 release of CMS Care Compare where the hospital type is either Acute Care or Critical Access. Of these 4,611 CCNs, we found 4,583 (99.4%) in the FFS claims with an NPI match (5,259 total NPIs).
There were also 3 NPIs that mapped to more than one CCN. One of these, ‘9999999996’, was used for 6 CCNs and is likely a reporting filler value, so we dropped this NPI. Luckily this did not impact the number of CCNs included in the preliminary crosswalk. For the remaining two NPIs, we did a manual check using the NPI Registry to assign each NPI to a single CCN.
Edit 5/15: We also did the same steps on 2020 MedPAR data, and unioned this table with the Medicare FFS claims data. This resulted in 4,585 CCNs with 8,227 total NPIs.
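The crosswalk steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code we ran: the claim rows are made-up toy values, build_crosswalk is a hypothetical helper name, and the only details taken from this post are the prvdr_num and org_npi_num fields and the ‘9999999996’ filler NPI.

```python
from collections import defaultdict

# Filler NPI named in the post; any other filler values would be an assumption.
FILLER_NPIS = {"9999999996"}


def build_crosswalk(claims, filler_npis=FILLER_NPIS):
    """Build distinct (CCN, NPI) pairs from (prvdr_num, org_npi_num) rows.

    Returns the set of distinct pairs and a dict of NPIs that map to
    more than one CCN, which still need a manual review.
    """
    pairs = set()
    for ccn, npi in claims:
        # Skip rows with a missing identifier or a known filler NPI.
        if not ccn or not npi or npi in filler_npis:
            continue
        pairs.add((ccn, npi))

    ccns_by_npi = defaultdict(set)
    for ccn, npi in pairs:
        ccns_by_npi[npi].add(ccn)

    ambiguous = {
        npi: sorted(ccns) for npi, ccns in ccns_by_npi.items() if len(ccns) > 1
    }
    return pairs, ambiguous


# Toy rows (hypothetical values, not real CCNs or NPIs).
claims = [
    ("010001", "1111111111"),
    ("010001", "1111111111"),  # duplicate claim row
    ("010001", "2222222222"),  # one CCN can have several NPIs
    ("010005", "3333333333"),
    ("010006", "3333333333"),  # one NPI on two CCNs: needs manual review
    ("010007", "9999999996"),  # filler NPI, dropped
    ("010008", "9999999996"),
]
pairs, ambiguous = build_crosswalk(claims)
```

Anything returned in ambiguous would then go through a manual check like the NPI Registry lookup described above. Adding MedPAR rows is a matter of passing the combined rows (or taking the union of the two pair sets).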
NPI in TAF data
Next, we looked at the billing_prvdr_npi and billing_prvdr_id in the 2020 Medicaid TAF inpatient files and selected distinct pairs. There were 9,432 NPIs and 23,339 billing provider IDs (see note 1). We filtered the NPI taxonomies to only include hospitals, resulting in 7,647 NPIs.
Joining this to our CCN NPI crosswalk, we found a match for 4,638 NPIs and therefore 4,477 CCNs (97.1% of our original CMS Care Compare universe).
Edit 5/15: With our updated Medicare FFS + MedPAR crosswalk, this was 5,903 NPIs and 4,482 CCNs (97.2%).
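For readers who want to reproduce the join, here is a rough sketch of how the match counts and the coverage percentage could be computed. It assumes the crosswalk is a set of (CCN, NPI) pairs and the TAF NPIs are a plain list; match_taf_npis and all of the identifier values are hypothetical.

```python
def match_taf_npis(taf_npis, crosswalk_pairs, universe_ccns):
    """Join distinct TAF NPIs to a (CCN, NPI) crosswalk.

    Returns matched NPIs, unmatched NPIs, the matched CCNs that fall in
    the universe, and the share of the universe those CCNs cover.
    """
    ccns_by_npi = {}
    for ccn, npi in crosswalk_pairs:
        ccns_by_npi.setdefault(npi, set()).add(ccn)

    taf_npis = set(taf_npis)
    matched_npis = taf_npis & set(ccns_by_npi)
    unmatched_npis = taf_npis - matched_npis

    matched_ccns = set()
    for npi in matched_npis:
        matched_ccns |= ccns_by_npi[npi]
    # Only count CCNs that belong to the reference universe.
    matched_ccns &= set(universe_ccns)

    coverage = len(matched_ccns) / len(universe_ccns) if universe_ccns else 0.0
    return matched_npis, unmatched_npis, matched_ccns, coverage


# Toy inputs (hypothetical identifiers).
crosswalk_pairs = {
    ("010001", "1111111111"),
    ("010001", "2222222222"),
    ("010005", "3333333333"),
}
taf_npis = ["1111111111", "3333333333", "4444444444"]
universe_ccns = {"010001", "010005", "010009", "010010"}

matched, unmatched, ccns, coverage = match_taf_npis(
    taf_npis, crosswalk_pairs, universe_ccns
)
```

The coverage figure is the same calculation as the percentages quoted above: 4,477 of 4,611 CCNs is about 97.1%, and 4,482 of 4,611 is about 97.2%.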
What about the unmatched NPIs in the TAF data?
For our CCN to NPI mapping, we only included pairs that were used in a Medicare Fee-For-Service claim. A single institution may have multiple NPIs, and it is possible that some of these appear in the inpatient TAF claims but not in the Medicare FFS claims. There were 5,717 unmatched NPIs from the TAF inpatient data, and 540 of these had an NPI taxonomy of General Acute Care Hospital.
Edit 5/15: This is now 3,720 unmatched NPIs, and 510 with an NPI taxonomy of General Acute Care Hospital.
What does this mean?
We could use the current CCN to NPI mapping, with the inpatient volume of the unmatched NPIs as a ‘known unknown’. There may be methods to add the additional NPIs from the TAF data, such as linking NPIs if they both have the same TAF state ID, or supplementing the CCN NPI crosswalk with the (somewhat outdated) 2017 public version released by NBER. Additional state IDs or NPIs could also be added by using information from the TAF Annual Provider base file, such as provider legal name and address – although these are not consistent and would require cleaning and text analysis to find matches.
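To make the first idea concrete, here is a hedged sketch of linking NPIs that share the same TAF state ID. We have not validated this approach; it is only an illustration. It assumes each TAF row has been reduced to a (submitting state, billing_prvdr_id, billing_prvdr_npi) tuple, that each already-matched NPI has been resolved to a single CCN, and that propagate_ccns is a hypothetical helper. A CCN is only carried over when a provider key points to exactly one CCN.

```python
from collections import defaultdict


def propagate_ccns(taf_rows, ccn_by_npi):
    """Infer CCNs for unmatched NPIs that share a state provider key.

    taf_rows: iterable of (submitting_state, billing_prvdr_id, npi).
    ccn_by_npi: dict of NPIs already matched to a single CCN.
    Returns a dict of newly inferred NPI -> CCN links.
    """
    npis_by_key = defaultdict(set)
    ccns_by_key = defaultdict(set)
    for state, prvdr_id, npi in taf_rows:
        # Provider IDs are only unique within a submitting state.
        key = (state, prvdr_id)
        npis_by_key[key].add(npi)
        if npi in ccn_by_npi:
            ccns_by_key[key].add(ccn_by_npi[npi])

    inferred = {}
    conflicts = set()
    for key, ccns in ccns_by_key.items():
        if len(ccns) != 1:
            continue  # key points to several CCNs: too ambiguous to use
        (ccn,) = ccns
        for npi in npis_by_key[key]:
            if npi in ccn_by_npi:
                continue  # already matched through the crosswalk
            if npi in inferred and inferred[npi] != ccn:
                conflicts.add(npi)  # different keys disagree on the CCN
            else:
                inferred[npi] = ccn

    # Drop NPIs whose provider keys disagree with each other.
    for npi in conflicts:
        inferred.pop(npi, None)
    return inferred


# Toy rows (hypothetical identifiers).
taf_rows = [
    ("NY", "A1", "1111111111"),  # matched NPI
    ("NY", "A1", "4444444444"),  # unmatched, same state + provider ID
    ("NJ", "A1", "5555555555"),  # same provider ID, different state
    ("NY", "B2", "6666666666"),  # no matched NPI on this key
]
ccn_by_npi = {"1111111111": "010001"}
inferred = propagate_ccns(taf_rows, ccn_by_npi)
```

In the toy data only the second NPI picks up a CCN. The New Jersey row shares the provider ID "A1" but not the submitting state, so it is left alone, which is why the key pairs the two fields together.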
The NPI Registry data does record (although I’m unsure how consistently) Medicaid state IDs associated with the NPI. This could be another source to link NPIs to state IDs, and then potentially to our CCN map if there is overlap in state IDs to multiple NPIs.
Not all TAF data analysis will require a CCN, and submitting state ID or NPI may be sufficient. Our goal, however, is to link this to CMS Hospital Cost Reports (which use CCNs), so the work continues!
1 Note – we pair billing provider ID with the submitting state, since unique providers are identified with both the submitting state and the state ID.