Wednesday 25 June 2014

Find Duplicate IDs with XSLT

This little snippet of XSLT finds all duplicate ids within an XML source document and generates a report giving, for each element that has a duplicate id attribute, the number of elements sharing that id and an XPath to the element.

This relies upon the attribute in question being named @id, but it is simple enough to change this to whatever attribute you need to interrogate.
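
As a rough sketch of that change, assuming a hypothetical attribute called name (not something the stylesheet below uses), these are the parts that would differ. Every reference to @id is swapped for @name; the node-xpath attribute is left out here because it does not mention the attribute at all.

```xml
<!-- Sketch only: @name is an assumed attribute, substitute your own -->
<xsl:key name="ids" match="*[@name]" use="@name"/>

<xsl:template match="/">
  <duplicates>
    <xsl:apply-templates select="//*[@name]"/>
  </duplicates>
</xsl:template>

<xsl:template match="*[@name]">
  <xsl:if test="count(key('ids', @name)) &gt; 1">
    <!-- add the node-xpath attribute from the full stylesheet here, unchanged -->
    <duplicate id="{@name}" dup-count="{count(key('ids', @name))}"/>
  </xsl:if>
</xsl:template>
```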

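Note that this is XSLT 2.0 (the path expression uses string-join() and a for expression, neither of which is available in XSLT 1.0) and has been used with the Saxon transformation engine.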


<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs">

  <xsl:output indent="yes"/>
  
  <!-- Index every element that carries an @id by the value of that attribute -->
  <xsl:key name="ids" match="*[@id]" use="@id"/>

  <xsl:template match="/">
    <duplicates>
      <xsl:apply-templates select="//*[@id]"/>
    </duplicates>
  </xsl:template>


  <!-- Report an element only when its @id is shared with at least one other element -->
  <xsl:template match="*[@id]">
    <xsl:if test="count(key('ids', @id)) &gt; 1">
      <!-- node-xpath is a positional path, one name[position] step per ancestor and for the element itself -->
      <duplicate
        id="{@id}"
        dup-count="{count(key('ids', @id))}"
        node-xpath="{string-join(
          (for $node in ancestor-or-self::*
           return concat($node/name(), '[',
                         count($node/preceding-sibling::*[name() = $node/name()]) + 1,
                         ']')),
          '/')}"/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
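
To make the report format concrete, here is a small made-up input document (the element names and id values are assumptions for illustration only), in which the id p1 is used twice:

```xml
<doc>
  <section id="intro">
    <para id="p1"/>
    <para id="p2"/>
  </section>
  <section id="body">
    <para id="p1"/>
  </section>
</doc>
```

Run against that input, the stylesheet should produce a report along these lines. The exact indentation and the XML declaration depend on the serializer, so treat the layout as approximate:

```xml
<duplicates>
   <duplicate id="p1" dup-count="2" node-xpath="doc[1]/section[1]/para[1]"/>
   <duplicate id="p1" dup-count="2" node-xpath="doc[1]/section[2]/para[1]"/>
</duplicates>
```

Each element with a shared id gets its own entry, so an id used twice appears twice in the report, once per location. Note that the paths have no leading slash, so they are relative to the document node.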
