Encoding Scheme

The following summarizes the TEI specification with which the Digital Temple transcriptions are encoded for the purposes of display and search-and-retrieval operations.

Structure

Poems, each in its own XML file, are encoded according to the TEI recommendations for parallel segmentation, in which all versions of a poem are aligned on a line-by-line basis:

<lg n="1" type="quatrain">
    <l n="1">
        <app>
            <rdg wit="#w">This is line 1 from a poem in Williams MS.
                Jones B62.</rdg>
            <rdg wit="#b">This is the same line as witnessed by Bodleian
                MS. Tanner 307.</rdg>
            <rdg wit="#ed1">And this is the line as witnessed by <hi rend="italic">The Temple.</hi></rdg>
        </app>
    </l>
    <l n="2">
        <app>
            <rdg wit="#w">This is line 2 from a poem in Williams MS.
                Jones B62.</rdg>
            <rdg wit="#b">This is the same line as witnessed by Bodleian
                MS. Tanner 307.</rdg>
            <rdg wit="#ed1">And this is the line as witnessed by <hi rend="italic">The Temple.</hi></rdg>
        </app>
    </l>
    <!-- Lines 3 and 4 follow here. -->
</lg>

Line groups (lg) carry the @type attribute, whose value is determined by meter and/or rhyme scheme; these patterns mostly, but not always, correspond to stanza divisions in the page layout of the source. For example, the manuscript versions of most of Herbert's sonnets leave no white space between the quatrains or between the third quatrain and the final couplet, but the transcriptions, both in the code base and in the display, reflect those structural divisions.
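
The parallel-segmentation structure described above lends itself to simple programmatic retrieval. The following Python sketch is illustrative only and is not part of the Digital Temple code base: the function name witness_lines and the sample fragment are hypothetical, and the sketch assumes un-namespaced elements, whereas a real TEI file would declare the TEI namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the example above (no TEI namespace, for brevity).
SAMPLE = """
<lg n="1" type="quatrain">
  <l n="1"><app>
    <rdg wit="#w">Williams line 1.</rdg>
    <rdg wit="#b">Bodleian line 1.</rdg>
    <rdg wit="#ed1">First edition <hi rend="italic">line 1.</hi></rdg>
  </app></l>
  <l n="2"><app>
    <rdg wit="#w">Williams line 2.</rdg>
    <rdg wit="#b"/>
    <rdg wit="#ed1">First edition line 2.</rdg>
  </app></l>
</lg>
"""

def witness_lines(lg, wit):
    """Return {line number: text} for one witness, e.g. wit="#ed1"."""
    lines = {}
    for l in lg.iter("l"):
        for rdg in l.iter("rdg"):
            # @wit is a space-separated list of witness pointers.
            if wit in rdg.get("wit", "").split():
                # Flatten child markup such as <hi> and normalize whitespace.
                text = " ".join("".join(rdg.itertext()).split())
                if text:
                    lines[l.get("n")] = text
    return lines

lg = ET.fromstring(SAMPLE)
ed1 = witness_lines(lg, "#ed1")
bodleian = witness_lines(lg, "#b")
```

An empty rdg element, as for the Bodleian witness in line 2 of the sample, simply yields no entry for that line.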

Notes

Light critical commentary, glosses, and textual notes are embedded within the encoded transcriptions using the note element. No formal distinction is made among different types of note. Where a note pertains to all witnesses, its immediate parent is the app (i.e., apparatus) element:

<l n="47">
    <app>
        <rdg wit="#w">Content of poem line.</rdg>
        <rdg wit="#b">Content of poem line.</rdg>
        <rdg wit="#ed1">Content of poem line.</rdg>
        <note>This note content pertains to all three witnesses.</note>
    </app>
</l>

Where it pertains to only one version, its parent is the rdg (reading) element of the appropriate witness:

<l n="47">
    <app>
        <rdg wit="#w">Content of poem line.<note>This note content pertains
            only to the Williams manuscript.</note></rdg>
        <rdg wit="#b"/>
        <rdg wit="#ed1"/>
    </app>
</l>

Where the note pertains to two of the three witnesses, it is contained within a separate rdg element that has no other content. For example, a note applicable to the Williams manuscript (#w) and the first edition of The Temple (#ed1), but not to the Bodleian manuscript (#b), is included as follows:

<l n="47">
    <app>
        <rdg wit="#w">Content of poem line.</rdg>
        <rdg wit="#b">Content of poem line.</rdg>
        <rdg wit="#ed1">Content of poem line.</rdg>
        <rdg wit="#w #ed1">
            <note>This note content pertains to witnesses #w and #ed1
                only.</note>
        </rdg>
    </app>
</l>
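
The three placement rules for notes can be summarized in a short Python sketch. It is a reading of the conventions described above rather than code from the edition; the function name note_scopes, the ALL_WITNESSES list, and the sample fragment (which combines all three cases in one line for brevity) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed witness sigla, taken from the examples in this section.
ALL_WITNESSES = ["#w", "#b", "#ed1"]

# Hypothetical fragment combining the three note placements in a single line.
SAMPLE = """
<l n="47">
  <app>
    <rdg wit="#w">Content.<note>Williams only.</note></rdg>
    <rdg wit="#b">Content.</rdg>
    <rdg wit="#ed1">Content.</rdg>
    <rdg wit="#w #ed1"><note>Williams and first edition.</note></rdg>
    <note>All three witnesses.</note>
  </app>
</l>
"""

def note_scopes(line):
    """Map each note's text to the list of witnesses it pertains to."""
    scopes = {}
    for app in line.iter("app"):
        for child in app:
            if child.tag == "note":
                # A note directly under <app> applies to every witness.
                scopes[child.text] = list(ALL_WITNESSES)
            elif child.tag == "rdg":
                for note in child.findall("note"):
                    # A note inside <rdg> applies to the witnesses named in @wit.
                    scopes[note.text] = child.get("wit", "").split()
    return scopes

scopes = note_scopes(ET.fromstring(SAMPLE))
```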

Page Breaks and Forme-Work Features

All page breaks and forme-work features (running titles, page and/or folio numbers, catchwords, and signatures) are tagged using milestone-type elements—i.e., elements outside of the hierarchical flow of text. In the following hypothetical example, a page break occurs before line 47 in the Williams manuscript only (and is accompanied by a running title, "The church"). The same line is the final line on a page in the first edition only and is followed there by a signature, catchword, and page break (itself accompanied by the forme-work features that occur in that witness), then by the next line in the poem. Thus, lines 47-48—with the milestone elements marking page breaks and forme-work features where they occur in their respective witnesses—look like this:

<pb ed="#w" facs="#w42v"/>
<fw facs="#w42v" type="header" place="margin-top">The church.</fw>

<l n="47">
    <app>
        <rdg wit="#w">Williams line content.</rdg>
        <rdg wit="#b">Bodleian line content.</rdg>
        <rdg wit="#ed1">First edition line content.</rdg>
    </app>
</l>

<fw facs="#ed143" type="sig" place="bottom">B4</fw>
<fw facs="#ed143" type="catch" place="bottom">The</fw>
<pb ed="#ed1" facs="#ed144"/>

<fw facs="#ed144" type="pageNum" place="margin-topleft">44</fw>
<fw facs="#ed144" type="header" place="margin-top">
    <hi rend="italic">The Church-porch.</hi>
</fw>

<l n="48">
    <app>
        <rdg wit="#w">Williams line content.</rdg>
        <rdg wit="#b">Bodleian line content.</rdg>
        <rdg wit="#ed1">First edition line content.</rdg>
    </app>
</l>
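
Because page breaks carry an @ed pointer and forme-work elements carry a @facs pointer, the milestones belonging to one witness can be picked out of the shared line sequence. The Python sketch below illustrates this under stated assumptions: the function names page_breaks and forme_work are hypothetical, the enclosing lg wrapper exists only to make the fragment well-formed, and the elements are un-namespaced.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment abbreviated from the example above.
SAMPLE = """
<lg>
  <pb ed="#w" facs="#w42v"/>
  <fw facs="#w42v" type="header" place="margin-top">The church.</fw>
  <l n="47"/>
  <fw facs="#ed143" type="sig" place="bottom">B4</fw>
  <fw facs="#ed143" type="catch" place="bottom">The</fw>
  <pb ed="#ed1" facs="#ed144"/>
  <fw facs="#ed144" type="pageNum" place="margin-topleft">44</fw>
  <l n="48"/>
</lg>
"""

def page_breaks(root, wit):
    """Return the @facs pointers of the page breaks recorded for one witness."""
    return [pb.get("facs") for pb in root.iter("pb") if pb.get("ed") == wit]

def forme_work(root, facs):
    """Return (type, text) pairs for the forme-work features on one page image."""
    return [(fw.get("type"), "".join(fw.itertext()).strip())
            for fw in root.iter("fw") if fw.get("facs") == facs]

root = ET.fromstring(SAMPLE)
```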

Aspects of Layout

Certain aspects of layout are reflected in the transcriptions only approximately, and others are not included at all. Line indentation patterns, for example, would require tedious encoding to be reproduced accurately. The editors have opted instead to include the @rend attribute on the div element of each poem to indicate one of three rendering possibilities for line justification: left (the default), right, or center, depending on which rendering most closely approximates the layout of the source. The Versioning Machine stylesheets read this attribute and display the poems accordingly.

We do not include in the markup other visual elements, such as the diagonal slashes separating stanzas in some poems of the Williams manuscript, or instances in the first edition where the final word or syllable of a line appears in parentheses at the end of the following line. (The l tag is defined in the TEI Guidelines as designating a line of verse, which to us means a unit with metrical structure, not a line of text that lacks metrical structure and merely happens to end at a certain point on the page. We could have included lb, i.e., line-break, tags to capture this visual aspect of the text, but decided that this would be both confusing and unnecessary.) The inclusion of high-resolution images of the sources obviates the need for such detail, even though the editors have aimed to reproduce in the transcriptions the original spelling and orthography. The primary purpose of the transcriptions is to capture the poems' intellectual content and aural dynamics for electronic retrieval; their status as visual phenomena is presumed to be secondary, except in the case of such distinctive shape-poems as "The Altar" and "Easter Wings." Even there, however, technical limitations of the interface prevent more than an approximate representation in the transcriptions of the poems' actual appearance in the sources.

Content Encoding Semantics

The transcriptions are both diplomatic and modern. That is, they preserve original spellings and orthography as well as abbreviations and some aspects of appearance such as italics and superscript characters, while also providing in the code base both modern spellings/orthography and expanded abbreviations.

Words whose original spelling/orthography deviates from the modern as recorded by the Oxford English Dictionary are treated using the TEI elements w (word), choice, orig, and reg as follows:

<w lemma="say">
    <choice>
        <orig>ſayd</orig>
        <reg>said</reg>
    </choice>
</w>

The @lemma attribute on element w is the OED headword, which would be "said" if the word were an adjective rather than, as in this case, the past-tense form of the verb "say." (Though it is possible to do so, we have not included a @type attribute on w to indicate the word's part-of-speech function as determined by context.) The

<choice>
    <orig/>
    <reg/>
</choice>

configuration allows the Versioning Machine stylesheets to "choose" to display the original spelling (with that initial Latin small long-s) in the surface transcription and the modern or regularized spelling in a tooltip-enabled mini-window (see Versioning Machine Instructions for more on this).

Abbreviations and elisions are handled similarly:

<w lemma="with">
    <choice>
        <orig>w<hi rend="superscript">th</hi></orig>
        <reg>with</reg>
    </choice>
</w>

<w lemma="every">
    <choice>
        <orig>eu'ry</orig>
        <reg>ev'ry</reg>
    </choice>
</w>

Notice in the first example the repetition of "with" as both the OED lemma and the regularized form. The latter is included to facilitate Versioning Machine rendering as described above, the former to maintain consistency. In the second example, however, the three forms are necessary—the lemma in this case capturing the actual dictionary headword, thus ensuring that searches for "every" will retrieve every instance of that word, however spelled or elided.
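
The retrieval benefit of the @lemma attribute can be shown with a brief Python sketch. It is offered as an illustration, not as the edition's search implementation: the function name find_by_lemma is hypothetical, the third word in the sample (with the spelling "euery") is invented for the purpose, and the elements are un-namespaced.

```python
import xml.etree.ElementTree as ET

# Hypothetical line; the third <w> is an invented spelling added for illustration.
SAMPLE = """
<l n="1">
  <w lemma="every"><choice><orig>eu'ry</orig><reg>ev'ry</reg></choice></w>
  <w lemma="with"><choice><orig>w<hi rend="superscript">th</hi></orig><reg>with</reg></choice></w>
  <w lemma="every"><choice><orig>euery</orig><reg>every</reg></choice></w>
</l>
"""

def find_by_lemma(root, lemma):
    """Return the original spellings of every <w> whose @lemma matches."""
    hits = []
    for w in root.iter("w"):
        if w.get("lemma") == lemma:
            orig = w.find("choice/orig")
            # itertext() also collects text nested in <hi>, e.g. superscripts.
            hits.append("".join(orig.itertext()))
    return hits

root = ET.fromstring(SAMPLE)
hits = find_by_lemma(root, "every")
```

A search on the lemma thus returns both the elided and the unelided spelling, which a search on the surface text alone would miss.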

This scheme includes the ubiquitous "then" for "than" and abbreviated forms of "the," "than," and "them"—ye, yn, ym—etc.:

<w lemma="than">
    <choice>
        <orig>then</orig>
        <reg>than</reg>
    </choice>
</w>

<w lemma="them">
    <choice>
        <orig>y<hi rend="superscript">m</hi></orig>
        <reg>them</reg>
    </choice>
</w>

Deletions, Additions, Corrections

The TEI Guidelines distinguish between apparent corrections/changes in the original source and those introduced by the transcriber/editor. Scribal emendations in the Williams and Bodleian manuscripts are thus handled using the deletion (del) and addition (add) elements. For example, in line 135 of "The Church Porch" as witnessed by Williams MS. Jones B62, the word "common-wealths" appears originally to have been written as "common-wellths" before being emended by writing an "a" over the first "l." The encoding marks this emendation as follows:

<w lemma="commonwealth">
    <choice>
        <orig>common-we<del rend="overwrite">l</del><add rend="overwrite">a</add>lths</orig>
        <reg>commonwealths</reg>
    </choice>
</w>

The editors here make no judgment regarding correctness. We simply observe the emendation as a scribal phenomenon and mark it accordingly. However, in line 36 of the same poem as witnessed by Bodleian MS. Tanner 307, the word "wordly" seems clearly to have been an unnoticed scribal error for "worldly." The editors thus intervene using the TEI elements sic and corr:

<w lemma="worldly">
    <choice>
        <sic>wordly</sic>
        <corr>worldly</corr>
    </choice>
</w>

Like original spelling/orthography, apparent scribal error is preserved in the diplomatic transcription, while the editorially supplied correction—like modernized spelling/orthography—is accessible through the Versioning Machine's tooltip mechanism mentioned above.
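
How a processor might derive the two views from this markup is sketched below in Python. This is not the Versioning Machine's stylesheet logic; the function name render is hypothetical, the sample is an artificial composite of the two words discussed above, and the decision to omit deleted text from both views is an assumption made to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Artificial composite of the two examples above, wrapped in a hypothetical <seg>.
SAMPLE = (
    '<seg>the <w lemma="worldly"><choice><sic>wordly</sic><corr>worldly</corr></choice></w> '
    '<w lemma="commonwealth"><choice><orig>common-we<del rend="overwrite">l</del>'
    '<add rend="overwrite">a</add>lths</orig><reg>commonwealths</reg></choice></w></seg>'
)

DIPLOMATIC = {"orig", "sic"}   # forms shown in the surface transcription
MODERN = {"reg", "corr"}       # forms offered as the regularized alternative

def render(elem, modern=False):
    """Flatten an element to text, choosing one branch of every <choice>."""
    keep = MODERN if modern else DIPLOMATIC
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            # Render only the branch that matches the requested view.
            for branch in child:
                if branch.tag in keep:
                    parts.append(render(branch, modern))
        elif child.tag != "del":
            # Assumption: deleted text is dropped so the emended state is shown.
            parts.append(render(child, modern))
        parts.append(child.tail or "")
    return "".join(parts)

seg = ET.fromstring(SAMPLE)
diplomatic = render(seg)
modern = render(seg, modern=True)
```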

Rhyme and Metrical Analysis

All poems have been tagged for metrical patterning and rhyme using the TEI element attributes @met, @real, and @rhyme. Our intention, in keeping with TEI protocol, has been to capture the conventional structure within which the poet appears to be working (the @met attribute), as well as the actual prosodic realization (the @real attribute). Note too that the marking of rhyme and meter pertains only to the #ed1 witness (i.e., the 1633 Temple) and not the manuscripts (except where a poem is found only in #w, the Williams manuscript). In the simplest cases (i.e., where the same pattern is repeated from stanza to stanza), these attributes are invoked on the poem's topmost element, the div tag:

<div type="poem" xml:id="poemTitle" met="pentameter/" rhyme="abab"/>

The poem contained within this division consists of nothing but quatrains in alternating rhyme, all lines in iambic pentameter, with the understanding that the content of the @met attribute pertains by default to each successive line (l) and the content of the @rhyme attribute to each successive line group (lg). If the conventional meter were to vary from line to line within stanzas, but according to a pattern consistent across all stanzas, the @met attribute on the div element would reflect this:

<div type="poem" xml:id="poemTitle" met="pentameter/dimeter/tetrameter/trimeter/" rhyme="abab"/>

Just as the alternating rhyme scheme is understood to apply to all stanzas, so too are all stanzas understood to repeat this pattern of four metrically distinct lines.
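
One way to read this inheritance rule is that the slash-separated names in @met form a cycle that restarts with each stanza. The Python sketch below encodes that reading; it is an interpretation of the convention described above, not code from the edition, and the function name conventional_meter is hypothetical.

```python
def conventional_meter(met, position):
    """Return the meter name for the line at a 1-based position within its stanza.

    `met` is a @met value such as "pentameter/dimeter/tetrameter/trimeter/".
    A single-name value such as "pentameter/" applies to every line.
    """
    names = [name for name in met.split("/") if name]
    # Cycle through the pattern so that it repeats from stanza to stanza.
    return names[(position - 1) % len(names)]

uniform = "pentameter/"
varied = "pentameter/dimeter/tetrameter/trimeter/"
second_line = conventional_meter(varied, 2)
```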

For poems with varying stanzaic forms (the sonnets, for example), the @rhyme attribute is invoked on the lg element. Note that there is no attempt to capture actual rhyme words or syllables: the abab content of the @rhyme attribute in the example above pertains only to the alternation common to all stanzas, each of which could (and is likely to) have rhyme sounds different from its sister stanzas. The TEI does provide a method for capturing such detail—the rhyme element in combination with the @label attribute applied to the actual rhyming words, thus enabling the identification of inter-stanzaic rhymes and, potentially, additional patterns—but it has not been employed in the present edition. (See the TEI recommendations for rhyme and metrical analysis for more on this.)

The editors, however, have made an exception for Herbert's sonnets, capturing with the @rhyme attribute alone both the individual stanzas' rhyme patterns (for example, three quatrains with alternating rhymes, and a couplet) and the repetition of terminal rhymes across line groups. Marking, for example, the more complex whole of Herbert's "The Answer," a sonnet with the unconventional rhyme pattern abab cdcd efef ee, means placing @rhyme attributes on the individual line group elements (lg) rather than on the higher-level div element:

<div type="poem" xml:id="answer" met="pentameter/">
    <lg n="1" type="quatrain" rhyme="abab">
        <!-- Lines 1-4 --></lg>
    <lg n="2" type="quatrain" rhyme="cdcd">
        <!-- Lines 5-8 --></lg>
    <lg n="3" type="quatrain" rhyme="efef">
        <!-- Lines 9-12 --></lg>
    <lg n="4" type="couplet" rhyme="ee">
        <!-- Lines 13-14 --></lg>
</div>

This implies that lines 9, 11, 13, and 14 share the same terminal rhyme. However, while this method is applied to all sonnets in the edition, the same cannot be said for all poems. Take a long poem such as "The Sacrifice," where the rhyme scheme, aaab, is identical in all stanzas but the terminal rhymes either differ from stanza to stanza (e.g., cccb or dddb) or are repeated in some stanzas according to no apparent pattern. To capture its shared rhymes would be cumbersome and laborious, not to mention tag-abusive. So the @rhyme attribute on that poem, and on others similarly resistant to simple analysis, should be understood to capture only the pattern consistent from stanza to stanza and not any inter-stanzaic rhymes.
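
For the sonnets, where rhyme labels are meant to carry across line groups, the shared rhymes can be recovered by reading the @rhyme values in sequence. The Python sketch below illustrates this with the values given for "The Answer"; the function name shared_rhymes is hypothetical, and the approach is valid only for poems encoded in this cross-stanza manner.

```python
from collections import defaultdict

def shared_rhymes(stanza_rhymes):
    """Map each rhyme label to the 1-based line numbers that carry it.

    `stanza_rhymes` lists the @rhyme value of each line group in order.
    """
    lines_by_label = defaultdict(list)
    line_no = 0
    for scheme in stanza_rhymes:
        for label in scheme:
            line_no += 1
            lines_by_label[label].append(line_no)
    return dict(lines_by_label)

# @rhyme values of the four line groups in the example above.
answer = shared_rhymes(["abab", "cdcd", "efef", "ee"])
```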

The same is true of poems with stanzas of varying metrical structure. "The Church Floor," for example, consists of four tercets and an octave, each stanza type having its own distinct metrical pattern:

<lg n="4" type="tercet" met="pentameter/trimeter/dimeter/" rhyme="abc">
    <l n="10">
        <app>
            <rdg wit="#b">But the sweet cement [. . .]</rdg>
            <rdg wit="#ed1"><!-- #ed1 content --></rdg>
        </app>
    </l>
    <!-- Lines 11-12. -->
</lg>

<lg n="5" type="octave" met="tetrameter/tetrameter/pentameter/tetrameter/tetrameter/pentameter/tetrameter/tetrameter/" rhyme="aabccbdd">
    <l n="13">
        <app>
            <rdg wit="#b">Hither sometimes [. . .]</rdg>
            <rdg wit="#ed1"><!-- #ed1 content --></rdg>
        </app>
    </l>
    <!-- Lines 14-20. -->
</lg>

As stated above, the @met attribute pertains to the conventional meter within which the poet is working. Such tagging is relatively straightforward. The @real attribute, however, is invoked to record the actual prosodic pattern of a given line, its "natural" speech rhythm, and is therefore a far more interpretive markup. Whereas the conventional meter captured using the @met attribute is highly repetitive, actual prosodic realization is highly idiosyncratic from line to line and must therefore be tagged on a line-by-line basis. The symbols used for this more meticulous encoding are declared in the teiHeader element, as follows:

<metDecl pattern="+-|/" xml:id="rootSymbols">
    <metSym value="+">accented syllable</metSym>
    <metSym value="-">non-accented syllable</metSym>
    <metSym value="|">foot division</metSym>
    <metSym value="/">line division</metSym>
</metDecl>

These elements describe the values of root symbols invoked elsewhere in the edition's metrical analysis apparatus.

The following elements describe values pertaining to the conventional meters within which the poet is working, handled in the markup by using the @met attribute on the highest-level text-division element possible, depending on the complexity of metrical patterning in a given poem:

<metDecl xml:id="convMetSymbols" corresp="#rootSymbols">
    <metSym value="monometer" terminal="false">-+/</metSym>
    <metSym value="dimeter" terminal="false">-+|-+/</metSym>
    <metSym value="trimeter" terminal="false">-+|-+|-+/</metSym>
    <metSym value="tetrameter" terminal="false">-+|-+|-+|-+/</metSym>
    <metSym value="pentameter" terminal="false">-+|-+|-+|-+|-+/</metSym>
    <metSym value="hexameter" terminal="false">-+|-+|-+|-+|-+|-+/</metSym>
    <metSym value="heptameter" terminal="false">-+|-+|-+|-+|-+|-+|-+/</metSym>
</metDecl>

The elements' contents invoke the root-level symbols described above. (The @terminal attribute being set to "false" indicates that the content of the metSym element is to be understood in light of some other metDecl element—namely, the one marked by the @xml:id attribute whose content is "rootSymbols.")
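
A declaration of this kind can be loaded into a lookup table so that a named meter resolves to its root-symbol pattern. The following Python sketch assumes an abbreviated, un-namespaced copy of the declaration with the @xml:id and @corresp attributes omitted; the function names load_symbols and foot_count are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the declaration above (two symbols only, attributes trimmed).
MET_DECL = """
<metDecl>
  <metSym value="dimeter" terminal="false">-+|-+/</metSym>
  <metSym value="pentameter" terminal="false">-+|-+|-+|-+|-+/</metSym>
</metDecl>
"""

def load_symbols(met_decl):
    """Return {symbol name: pattern} from a metDecl element."""
    return {sym.get("value"): sym.text for sym in met_decl.iter("metSym")}

def foot_count(pattern):
    """Count the feet in one line pattern such as '-+|-+/'."""
    return len(pattern.rstrip("/").split("|"))

symbols = load_symbols(ET.fromstring(MET_DECL))
```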

The elements below, finally, describe values pertaining to actual accentual patterns, handled in the markup by using the @real attribute (usually on the l element, but possibly on the parent lg element). Again, the elements' contents invoke the root-level symbols described above:

<metDecl xml:id="realSymbols" corresp="#rootSymbols">
    <metSym value="I" xml:id="iamb" terminal="false">-+</metSym>
    <metSym value="T" xml:id="trochee" terminal="false">+-</metSym>
    <metSym value="D" xml:id="dactyl" terminal="false">+--</metSym>
    <metSym value="A" xml:id="anapest" terminal="false">--+</metSym>
    <metSym value="AMP" xml:id="amphibrach" terminal="false">-+-</metSym>
    <metSym value="S" xml:id="spondee" terminal="false">++</metSym>
    <metSym value="P" xml:id="pyrrhic" terminal="false">--</metSym>
</metDecl>

The following example illustrates our deployment of these symbols in the edition's markup:

<div type="poem" xml:id="flowerB_ED1" met="tetrameter/pentameter/tetrameter/pentameter/dimeter/dimeter/tetrameter/" rhyme="ababccb">
    <lg type="septet" n="1">
        <l n="1">How freſh, O Lord, how ſweet and clean</l>
        <l n="2" real="I|I|D|T|I/">Are thy returns! ev'n as the flowers in
            spring;</l>
        <!-- Remaining lines 3-7 of this poem's first stanza. -->
    </lg>
</div>

Because the poem's seven stanzas all share the same rhyme scheme and (conventional) metrical pattern, this information is stated only once, on the top-level div element. Where a line (i.e., l) element does not include a @real attribute, its actual prosodic realization is understood by default to be identical with the line's conventional meter as indicated by the @met attribute on the top-level div element. The actual accentual pattern of line 1, therefore, corresponds exactly to the pattern symbolized by the character string "tetrameter," a value corresponding to the alternating pattern of unaccented and accented syllables given as the content of its metSym element: -+|-+|-+|-+/. Line 2, however, departs from the conventional pentameter pattern indicated by the @met attribute on the top-level div element, and so carries a @real attribute recording its actual accentual pattern: I|I|D|T|I/. Note that these latter symbols, given their correspondence with the root symbols declared in the teiHeader, can be further construed as -+|-+|+--|+-|-+/.
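
The expansion of a @real value into root symbols is mechanical and can be sketched in a few lines of Python. The FEET table restates the realSymbols declaration given earlier; the function names expand_real and syllable_count are hypothetical and are not drawn from the edition's code base.

```python
# Foot symbols and their root-symbol patterns, as declared in realSymbols.
FEET = {
    "I": "-+",     # iamb
    "T": "+-",     # trochee
    "D": "+--",    # dactyl
    "A": "--+",    # anapest
    "AMP": "-+-",  # amphibrach
    "S": "++",     # spondee
    "P": "--",     # pyrrhic
}

def expand_real(real):
    """Rewrite a @real value such as 'I|I|D|T|I/' in root symbols."""
    lines = []
    for line in real.split("/"):
        if line:
            # Replace each foot symbol, keeping the foot and line dividers.
            lines.append("|".join(FEET[foot] for foot in line.split("|")) + "/")
    return "".join(lines)

def syllable_count(real):
    """Count the syllables (accented or not) implied by a @real value."""
    return sum(ch in "+-" for ch in expand_real(real))

line_2 = expand_real("I|I|D|T|I/")
```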

Because all versions of all poems are encoded in parallel, line-by-line, more complicated encoding is required for poems whose stanza divisions differ from version to version. The #b and #ed1 versions of "Evensong," for example, consist respectively of octaves and quatrains, so that lines from both do not nest neatly within one and the same lg element. It is possible in such cases to apply metrical- and rhyme-analysis tags to the discrete-witness XML files, but such encoding is not provided here.