I have a large XML file (600 MB) that I want to convert to CSV using terminal commands. So far I have converted the XML to TXT with the xml2 command, using the following syntax:

xml2 < BIG.xml > BIG.txt

My XML format is:

<ReportDetails>
    <Date>08/08/2012</Date>
    <CaseNo>13030903</CaseNo>
    <UserDetailsText>Individual Details</UserDetailsText>
    <UserDetails>
        <UserId>0903</UserId>    
        <FirstName>John</FirstName>
        <Surname>Perry</Surname>
        <Occupation>Developer</Occupation>
        <DateofBirth>02/14/1981</DateofBirth>    
    </UserDetails>
    <ApplicationDetailsText>Conflict Resolution Details</ApplicationDetailsText>
    <ApplicationDetails>
        <ApplicationNo>13033</ApplicationNo>
        <ApplicationName>John Perry</ApplicationName>
        <Department>Information Technology</Department>
        <ApplicationType>Personal</ApplicationType>
        <ApplicationDate>06/07/2012</ApplicationDate>
        <ApplicationEndDate>09/07/2012</ApplicationEndDate>
        <ApplicationStatus>Closed</ApplicationStatus>    
     </ApplicationDetails>  
</ReportDetails>

I want these fields in the CSV file, separated by a pipe (|):

Date | CaseNo | FirstName | Surname | ApplicationNo | ApplicationName | ApplicationDate | ApplicationStatus

Also, if I want to run the conversion from a PHP file, will I need a shell script to do that?

HardCode

1 Answer

Use XSLT to transform the XML into exactly the format you need, e.g.:

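<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="iso-8859-1"/>
<xsl:strip-space elements="*" />

<!-- One output line per record; assumes ReportDetails is the document root, as in the sample -->
<xsl:template match="/ReportDetails">
  <xsl:value-of select="Date"/><xsl:text>|</xsl:text>
  <xsl:value-of select="CaseNo"/><xsl:text>|</xsl:text>
  <xsl:apply-templates select="UserDetails" />
  <xsl:apply-templates select="ApplicationDetails" />
  <xsl:text>&#10;</xsl:text>
</xsl:template>

<!-- Only the wanted fields are selected; UserId, Occupation and DateofBirth are skipped -->
<xsl:template match="/ReportDetails/UserDetails">
  <xsl:value-of select="FirstName"/><xsl:text>|</xsl:text>
  <xsl:value-of select="Surname"/><xsl:text>|</xsl:text>
</xsl:template>

<!-- Last group of fields, so no trailing pipe after ApplicationStatus -->
<xsl:template match="/ReportDetails/ApplicationDetails">
  <xsl:value-of select="ApplicationNo"/><xsl:text>|</xsl:text>
  <xsl:value-of select="ApplicationName"/><xsl:text>|</xsl:text>
  <xsl:value-of select="ApplicationDate"/><xsl:text>|</xsl:text>
  <xsl:value-of select="ApplicationStatus"/>
</xsl:template>

</xsl:stylesheet>

Then, with the stylesheet above saved as foo.xsl, transform the original XML document foo.xml:

$ xsltproc foo.xsl foo.xml
08/08/2012|13030903|John|Perry|13033|John Perry|06/07/2012|Closed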

(The devil is in the details of the xsl... there are numerous ways to implement this...)
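
As one of those other ways, the flat output of the xml2 command from the question can be reshaped with awk instead of XSLT. This is only a minimal sketch under stated assumptions: it assumes xml2 writes one path=value line per text node (for example /ReportDetails/Date=08/08/2012), that every record starts with a Date element directly under ReportDetails, and that no value contains a pipe. The function name xml2_to_psv is made up for illustration. Because xsltproc parses the whole document into memory before transforming it, a line-by-line pipeline like this may be gentler on a 600 MB file.

```shell
#!/bin/sh
# Hypothetical helper: reads xml2-style "path=value" lines on stdin and
# prints one pipe-separated line per ReportDetails record.
xml2_to_psv() {
    awk '
    BEGIN {
        OFS = "|"
        # Wanted fields, in output order (taken from the question).
        n = split("Date CaseNo FirstName Surname ApplicationNo ApplicationName ApplicationDate ApplicationStatus", want, " ")
    }
    # Print the record collected so far, then reset it.
    function emit(   i, line) {
        if (!have) return
        line = rec[want[1]]
        for (i = 2; i <= n; i++) line = line OFS rec[want[i]]
        print line
        for (i = 1; i <= n; i++) delete rec[want[i]]
        have = 0
    }
    {
        eq = index($0, "=")            # split at the first "=" only
        if (eq == 0) next              # skip lines that carry no value
        path  = substr($0, 1, eq - 1)
        value = substr($0, eq + 1)
        m = split(path, parts, "/")
        name = parts[m]                # element name = last path component
        # Assumption: a Date directly under ReportDetails opens a new record.
        if (name == "Date" && parts[m - 1] == "ReportDetails") emit()
        for (i = 1; i <= n; i++)
            if (name == want[i]) { rec[name] = value; have = 1 }
    }
    END { emit() }
    '
}

# Small demo on hand-written lines in the assumed xml2 format.
xml2_to_psv <<'EOF'
/ReportDetails/Date=08/08/2012
/ReportDetails/CaseNo=13030903
/ReportDetails/UserDetails/FirstName=John
/ReportDetails/UserDetails/Surname=Perry
/ReportDetails/ApplicationDetails/ApplicationNo=13033
/ReportDetails/ApplicationDetails/ApplicationName=John Perry
/ReportDetails/ApplicationDetails/ApplicationDate=06/07/2012
/ReportDetails/ApplicationDetails/ApplicationStatus=Closed
EOF
```

On the real file this would be used as: xml2 < BIG.xml | xml2_to_psv > BIG.csv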

michael