UNII to MeSH

The link from UNII to NDF-RT is very fragile, but because NDF-RT structure classes are just MeSH, why don't we use just MeSH directly?

Preconditions

UNII to MeSH

We have a pretty good mapping from UNII to CUI through the UMLS these days OriginalMeSH.

But we can still supplement with the name-based mapping:

INSERT INTO umls.UNII_MESH_MAP
SELECT DISTINCT u.code AS unii, d.code
  FROM umls.unii_name u
    INNER JOIN UMLS.MRCONSO c
      ON(LOWER(str) = LOWER(displayName) AND LAT='ENG')
    INNER JOIN UMLS.MRCONSO d ON(d.cui = c.cui AND d.sab = 'MSH')
  WHERE NOT EXISTS (SELECT 1 FROM umls.unii_mesh_map x WHERE x.unii = u.code)

1672 items added

with SELECT count(distinct unii) FROM unii_mesh_map 4304 distinct UNII codes (of 6482 UNII concepts). Previously the UNII_NDFRT_MAP only caught 857 or at most 1500, but that may be outdated.

SELECT * 
FROM umls.unii_concept u
LEFT OUTER JOIN
(SELECT unii, count(1) as count
  FROM umls.unii_mesh_map
  group by unii) x ON(x.unii = u.code)
INNER JOIN umls.unii_mesh_map um ON(um.unii = x.unii)
INNER JOIN umls.mesh_chemical mc ON(mc.code = um.code)
ORDER BY count DESC, displayname ASC

We used to have a classification of UNIIs which might be useful for allergy checking. Now our MeSH chemical RIM mapping is very large. Possibly because of mapped-to supplemental concepts. What if we removed those?

drop table umls.unii_mesh_class
go
create table umls.unii_mesh_class as
select distinct uc.code as unii_code, umm.code as mesh_code, mc.gencode as mesh_class
  from umls.unii_concept uc
    inner join umls.unii_mesh_map umm on(umm.unii = uc.code)
    inner join umls.mesh_chemical_isa_closure mc on(mc.speccode = umm.code)
  where (not mc.speccode like 'C%' or distance=1)
    and (not mc.speccode like 'D%' or distance=0)
    and mc.gencode like 'D%'
select count(distinct mesh_class), count(distinct mesh_code)
 from umls.unii_mesh_class

2041 vs. 3541

that's ~1500 saved. But really it's more because we have all of mesh.

select count(distinct gencode), count(distinct speccode) 
  from umls.mesh_chemical_isa_closure
    where gencode like 'D%'
      and speccode like 'C%'

6753 vs. 175329

that's a huge difference!

So let's just focus only on the MeSH classes that are D descriptors, not C supplemental chemicals. and use the unii_mesh_class for the mapping.

create index umls.unii_mesh_class_unii_idx on umls.unii_mesh_class(unii_code)

The XML export is in [source:db2xml/trunk/templates/mesh-rim.xml].