ALTO (XML)

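ALTO (Analyzed Layout and Text Object) is an open XML Schema for describing the text and layout information produced by OCR of digitized pages. It was originally developed by the EU-funded METAe project and is now hosted by the Library of Congress. It is often used together with the Metadata Encoding and Transmission Standard (METS).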

Structure

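An ALTO file consists of three major sections, each a child of the root <alto> element: <Description> holds metadata about the file, <Styles> holds the text and paragraph styles, and <Layout> holds the page content.[1]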

    <?xml version="1.0"?>
    <alto>
      <Description>
        <MeasurementUnit/>
        <sourceImageInformation/>
        <Processing/>
      </Description>
      <Styles>
        <TextStyle/>
        <ParagraphStyle/>
      </Styles>
      <Layout>
        <Page>
          <TopMargin/>
          <LeftMargin/>
          <RightMargin/>
          <BottomMargin/>
          <PrintSpace/>
        </Page>
      </Layout>
    </alto>
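
As a rough illustration of how this structure can be read programmatically, the following Python sketch parses a small hand-written ALTO document with the standard library and lists its top-level sections. The sample document, the helper names (local_name, top_level_sections, line_texts) and the assumption that text is stored in the CONTENT attribute of <String> elements inside <TextLine> are illustrative choices rather than part of the outline above. Real ALTO files usually declare a versioned XML namespace, so the sketch strips any namespace prefix from tag names instead of assuming a particular one.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal ALTO document, written by hand for this sketch.
SAMPLE = """<?xml version="1.0"?>
<alto>
  <Description>
    <MeasurementUnit>pixel</MeasurementUnit>
  </Description>
  <Styles/>
  <Layout>
    <Page>
      <PrintSpace>
        <TextBlock>
          <TextLine>
            <String CONTENT="Hello"/>
            <SP/>
            <String CONTENT="world"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Drop a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def top_level_sections(root):
    """Return the names of the direct children of the root <alto> element."""
    return [local_name(child.tag) for child in root]


def line_texts(root):
    """Return one string per <TextLine>, joining its <String> contents with spaces."""
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) == "TextLine":
            words = [s.get("CONTENT", "") for s in elem
                     if local_name(s.tag) == "String"]
            lines.append(" ".join(words))
    return lines


root = ET.fromstring(SAMPLE)
print(top_level_sections(root))  # ['Description', 'Styles', 'Layout']
print(line_texts(root))          # ['Hello world']
```

Matching on local tag names keeps the sketch independent of the ALTO version in use; code that targets a single known version could instead pass an explicit namespace mapping to ElementTree's find methods.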

This article is issued from Wikipedia, version of Thursday, November 26, 2015. The text is available under the Creative Commons Attribution/Share Alike license, but additional terms may apply for the media files.