CHAPTER 2. TRANSCRIBING CONVENTIONS

This chapter presents a set of basic symbols for transcribing spoken discourse, along with comments on how to use them. For each symbol or convention, examples are presented, drawn from transcriptions of natural conversations. In some cases, we comment on such issues as why the phenomenon in question should be attended to. Where appropriate, we also comment on relevant details of orthographic convention or style, such as the placement of spaces. (Unless otherwise noted, the transcription symbols presented below are always to be preceded by a single space and followed by a single space; that is, they are to be separated from surrounding words, and other material, by one space.)

2.1 Pause and prosody.

The placement and timing of pauses in spoken discourse convey significant information about the speaker's discourse production process and orientation toward the ongoing conversational interaction. Pauses should be indicated explicitly using one of the following three notations. Since the intonational symbols (e.g. comma and single period, §2.2) do not of themselves denote pause, any pause (even a slight one) that occurs in conjunction with an intonation contour must be specifically indicated using one of the pause notations. By convention, a pause between two intonation units is written together with the unit that follows it rather than with the one that precedes it.

(1) ...(.n) long (timed) pause

This indicates a pause of about 0.7 seconds or longer, for which the approximate duration is indicated in parentheses, to the nearest tenth of a second (as determined roughly with a stopwatch). (That is, the duration is indicated as (.7), (.8), (1.2), etc.) (A space precedes the initial period, and a space follows the right parenthesis, but no space appears internal to this character string.)

(1a)
D: ...(3.0) I 'had them 'done at ^pips. ...(1.0) You ^see it, ((TRN_CARS)) (1b)
R: ... (HH) 'We 'start 'out ...(.8) with ...(.8) 'dead ^horse hooves. ((TRN_RANCH)) (1c)
R: ... ^This .. is a 'type of 'person, ...(.9) 'that ...(.7) is 'like ...(1.0) a 'hermit. ((TRN_RANCH)) (1d)
J: .. when 'I think of ^a=ds, .. I 'think of | ...(1.2) 'aesthetics. ((TRN_AESTH)) (1e)
A: ... ^down at the= uh -- ...(1.2) ^reading the 'gau=ge, ((TRN_FARM))

In some cases, the questions of whether a pause has occurred in a specific place, how long it lasts, and whose pause it is become subtly and inextricably linked to the interpretation of turn-taking and overlapping between speakers (§3.2, step 15).

(2) ... medium pause

This indicates a pause which is noticeable, but not very long -- about half a second in duration (0.3 - 0.6 seconds). (A space precedes and follows the string of three periods, but no spaces appear between them.)

(2a)
J: m=hm. S: ... 'That's what .. the ^poet is 'after, ((TRN_AESTH)) (2b)
S: .. (HH) 'U=m, ... That's ^o=ne 'kind of thing, ((TRN_AESTH)) (2c)
G: ...(1.7) I'd 'like to 'have .. my% ... ^lungs, ... my ^entire respiratory 'tract, ... (HH) ^replaced, ... (HH) with .. 'asbestos. .. or 'something. ((TRN_HYPO))

(3) .. very short pause; tempo lag

This indicates a brief break in speech rhythm: that is, a very short, barely perceptible pause (about 0.2 seconds or less), or a lag in tempo. The best way to determine whether the two-dot symbol is called for is to imagine a metronome ticking at the same rate as the speaker is currently producing syllables. A word which lags behind the speaker's rate of syllable production (or lags behind one's mental metronome ticks) exhibits the tempo lag, and should be preceded by two dots. It should be noted that not all instances of the two-dot symbol will correspond to an actual silence, nor are all brief silences to be marked: the moment of silence which necessarily occurs during a lexically or phonologically required glottal stop (or other voiceless stop) is not to be written with two dots. The reason for this is that we are interested in the pause as a functional cue to aspects of discourse production and conversational interaction, not as a raw acoustic fact. (A space precedes and follows the string of two periods, but no space appears between them.)

(3a)
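The duration ranges given for the three pause notations can be summarized in a short sketch. The Python function below is illustrative only and is not part of the transcription system itself: the name `pause_notation` is hypothetical, and rounding to the nearest tenth of a second before classifying is an assumption made here so that durations falling between the stated ranges (e.g. 0.65 seconds) still receive a symbol.

```python
def pause_notation(seconds: float) -> str:
    """Return the pause symbol for a measured pause (hypothetical helper)."""
    # Assumption: work in whole tenths of a second, since the text states
    # timed durations to the nearest tenth.
    tenths = int(round(seconds * 10))
    if tenths >= 7:
        # Long (timed) pause: three dots plus the duration in parentheses,
        # with no leading zero and no internal space, e.g. ...(.7), ...(1.2)
        whole, frac = divmod(tenths, 10)
        return f"...({whole if whole else ''}.{frac})"
    if tenths >= 3:
        # Medium pause: roughly 0.3 to 0.6 seconds
        return "..."
    # Very short pause or tempo lag: roughly 0.2 seconds or less
    return ".."
```

In use, `pause_notation(1.2)` yields the timed form, while shorter durations yield the three-dot or two-dot symbol. Whether a tempo lag without actual silence deserves the two-dot symbol remains a listening judgment that a duration measurement cannot make.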
R: ... And 'then, .. they ^videotape us, .. 'as we ^go. ((TRN_RANCH)) (3b)
R: .. a ^reining pattern is, .. a ^pattern where you= .. do sliding 'sto=ps, .. spi=ns, ... ^lead changes, .. I ^know you 'probably don't 'know what that 'is. ((TRN_RANCH)) (3c)
D: .. I mean, 'I have the 'opportunity, to ^talk to people, .. to ^get the 'phone book, ((TRN_CARS)) (3d)
B: ... 'She just .. pulled the 'cat | .. and the 'kittens ^out, .. and 'pulled off the ^bread that was 'dirty and, ... we ^served the 'rest of it. ((TRN_DEPR)) (3e)
J: .. I mean, there are ^people that ar=e .. just 'hard to .. ^sell to. S: .. mhm, J: ... and 'hard to ^advertise to. ((TRN_AESTH))

(4) = lengthened segment

This indicates that a syllable or segment is lengthened prosodically (to a degree greater than what is expected on the basis of lexical stress patterns). The slight lengthening which is to be expected when a syllable is stressed is not marked with the equals sign (being implicit in the stress marking). Similarly, segments which are phonemically long (in a language with a length contrast for vowels or consonants) do not receive the equals sign notation: they should be written with a different symbol (e.g. colon or doubled letters). Prosodic lengthening is often heard in the final syllable of the intonation unit (especially if the word bears nuclear stress). (The equals sign is written immediately following the lengthened segment; no spaces separate it from the letters of the word it appears in. For phonemes that are represented in standard orthography by a digraph (e.g. in English, ee, ea, oo, ph, ch, tt, etc.), the convention is that the equals sign is written after both letters of the digraph.)

(4a)
K: ... (.7) ^Glen's never had a% .. a ^co=ld, .. or the ^flu=, ((TRN_HYPO)) (4b)
A: ... The 'thing ^about him 'i=s, .. he 'ca=n't ^spe=ll. ((TRN_FARM)) (4c)
A: .. and I decide I'm going to get a ^ne=w door, .. and a ^ne=w 'jamb. ((TRN_DOOR)) (4d)
N: .. (HH) she was ^f=rantically | .. ^running 'arou=nd, like 'trying to get ^away from him. ((TRN_J&J))

(5) N primary stress

This symbol indicates a word which bears a primary stress. The stressed word is immediately preceded by a double quote mark, with no space between it and the word. In English and many other languages, the particular syllable on which this stress is realized is lexically predictable, and thus need not be indicated in a discourse-level transcription. (For the occasional utterance of a word token in which stress is realized on a syllable other than the normal one, this fact can be captured by using the notation provided for phonetic transcription (§2.4).)

(5a)
B: .. ^I met 'him, and I 'thought he was a 'ni=ce ^kid. S: .. He ^is a nice 'kid. but he's ^wei=rd. ((TRN_FARM)) (5b)
B: .. I ^never 'met the guy=. ((TRN_FARM)) (5c)
J: 'This is one of the things I've ^thought about, a ^lot. S: (0) 'Yeah. ((TRN_AESTH))

(6) ' [grave accent] secondary stress

The grave accent character indicates a word which bears secondary stress (relative to nearby stressed and unstressed words). The grave accent immediately precedes the word in question, with no space between it and the word.

(6a)
J: ... 'You know, 'that's just a 'fact about that ^thing. ((TRN_AESTH)) (6b)
G: ...(2.2) 'a=nd, of course, a 'lot of herb 'tea, when I'd 'rather be drinking ^whiskey. ((TRN_HYPO)) (6c)
R: ... You know, ^I had been 'practicing this |.. with my ^horse, .. for a 'lo=ng ^time. but ^never when anybody was around. ((TRN_RANCH))

Because it can be difficult to distinguish reliably between three degrees of stress -- primary stress, secondary stress, and (implicitly) non-stress -- some researchers may prefer to mark only two degrees of stress, corresponding to stress (to be written with grave accent) and non-stress (unmarked).

2.2 Intonation contour.

The five symbols in this section are used for very partially representing the shape of intonation contours, using for the most part the available symbols of written punctuation. We are not particularly satisfied with these categories and notations for intonation, but we can make do with them as long as it is realized that the punctuation symbols are to be used and interpreted intonationally, and not grammatically or semantically. For researchers who wish to invest the considerable effort required to do justice to intonation in discourse, the work of the London and Lund researchers (Crystal 1975, Svartvik and Quirk 1979, etc.), Gumperz (1982), and others should be consulted (see Cruttenden 1986 for additional references). (For the notion of intonation unit, see Chafe (1979, 1960b, 1967, forthcoming); for a discussion of point-by-point vs. unit summary systems for intonation transcription, see Du Bois (forthcoming a).) (The intonation contour symbols in this section are written at the end of the line they appear in.)

(7) . (period) final pitch contour

The period is used to indicate a pitch contour which is understood as final in a given language. For English and many other languages, this means primarily (but not exclusively) a fall in pitch at the end of an intonation unit. It is important to recall that, since this symbol has intonational rather than syntactic meaning, it can appear in places other than the end of a sentence. Conversely, it need not appear at the end of every (normative) sentence.

(7a)
J: ...(1.5) You're 'not ^say=ing something, you're ^doing something to people. ((TRN_AESTH)) (7b)
A: You 'don't ^see them very often. ((TRN_AFR)) (7c)
R: .. For 'what. B: ... They 'make ^rope of it. ((TRN_DEPR)) (7d)
A: 'We don't have a ... kind of | ..'vehicle to ^tra=nsport these things. ((TRN_DOOR)) (8) , [comma] continuing pitch contour The comma is used to indicate a pitch contour which is understood as continuing in a given language. In practice this is a loose cover symbol for a variety of nonfinal contours that are neither period intonation nor question intonation. The contour is often realized in English as a level or slight rise in pitch at the end of an intonation unit, but it has other realizations also, each of which no doubt has slightly different pragmatic implications. (A perspicacious and efficient means of distinguishing among the many contours subsumed under this symbol would be a valuable contribution to discourse studies.) (8a)
R: If you 'think about it, 'yeah, if it 'rains a lot, .. the 'horse is always 'we=t, .. and it's always 'moi=st, .. it's always on something 'moi=st, ... ^Sure it's going to be 'softer. ((TRN_RANCH)) (8b)
D: .. I have my ^own 'telephone, my ^brie=fca=se, I can 'work on ^cli=ents, all the 'time, .. (HH).. 'You know, ^call them on the 'pho=ne, .. and uh=, ... 'take a ^lunch, ((TRN_CAR)) (8c)
J: .. (HH) And I looked 'over, ... ^into the 'street, and saw this ^cop car, 'going along, .. ^right ... 'next to me, you 'know, like .. 'five miles an ^hou=r. ((TRN_J&J))

(9) ? rising question contour

This indicates a marked rise in pitch at the end of an intonation unit, as is characteristic of a polar (yes-no) question. It is not used for a grammatical question uttered with declarative intonation. Conversely, it may appear at the end of units which do not have the morphosyntactic structure of a (normative) question.

(9a)
B: ... But .. ^were they 'rattle snakes ? ((TRN_DEPR)) (9b)
B: .. 'she never 'raised ^hemp? ((TRN_DEPR)) (9c)
D: I 'ordered a ^thou=sand 'business cards. G: Yeah? ... You 'get them 'printed ^here ? ((TRN_CARS)) (9d)
A: .. And we were 'ma=d, ... because 'Gladys had told us we 'had to be 'back by ^Monday, .. even though 'Monday was a ^holiday ? .. ^Remember that ? ((TRN_APFRICA)) (9e)
J: < Q ... 'Should we ^waste him ? or should we ^stop him, and ... ^then 'waste him. Q> ((TRN_J&J))

(10) ! exclamatory intonation

The exclamation point marks an intonation contour that is understood as exclamatory. It is typically realized as increased pitch range and sudden pitch movement, and sometimes increased loudness.

(10a)
S: ^Bo=y was that 'goo=d! ((TRN_AESTH)) (10b)
D: ...(.9) 'No 'basketball. G: ...(1.0) ^Really! ((TRN_CARS)) (10c)
M: .. 'You're ^kidding! S: (0) 'Yeah. ((TRN_FARM)) (10d)
S: ... A 'lot of it's really ^ba=d! ((TRN_AESTH)) (10e)
A: ... 'That guy makes 'ZZZ look ^kick-ba=ck. B: ...(1.0) ^Wha=t! ((TRN_FARM)) (10f)
B: ... 'we ^served the 'rest of it. R: ... You're ^kidding. B: .. ^No=! ((TRN_DEPR))

(11) -- [two tildes/dashes] truncated intonation unit

This indicates that the speaker breaks off the intonation unit (§2.10) before completing its projected contour. This occurs primarily in cases where a speaker utters the initial portion of an intonation contour, but abandons it before completing it -- that is, in a false start.

Double tilde is not intended to represent the case of a unit which appears incomplete when measured against the canons of normative clause grammar. Intonation units which do not constitute complete clauses are of course commonplace: they are frequently marked with a comma at the end, which signals continuing intonation -- a kind of incompleteness, if you will, but of a variety which is distinct in principle from the truncation signaled by double tilde. The comma-delimited unit typically constitutes (apparently) all that the speaker projected to say within the current unit, while in the double tilde-delimited unit the speaker projected to say more with the current unit, but abandoned some portion of the projected utterance. Truncation is thus measured not against normative notions of clause completeness, but against the speaker's presumed projection for the current unit.

Note that virtually every intonation unit should have some intonation contour symbol at the end of it (i.e. at the end of every line). If an intonation unit does not have a comma, period, question mark, or exclamation point, it will in general have a double tilde (Du Bois, forthcoming a).

(11a)
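The rule that virtually every intonation unit ends in a contour symbol lends itself to a mechanical check. The sketch below is a minimal illustration, not an established tool: the function name `final_contour` is hypothetical, and it assumes that each string passed in holds exactly one intonation unit with any source tag such as ((TRN_...)) already removed.

```python
from typing import Optional

# Contour symbols from this section; "--" marks a truncated intonation unit.
# "--" is listed first so that it is tested before the single characters.
CONTOUR_SYMBOLS = ("--", ",", ".", "?", "!")

# Bare pause symbols, which end in periods but are not contour symbols.
PAUSE_SYMBOLS = ("..", "...")


def final_contour(unit: str) -> Optional[str]:
    """Return the contour symbol closing one intonation unit, or None."""
    tokens = unit.split()
    if not tokens:
        return None
    last = tokens[-1]
    if last in PAUSE_SYMBOLS:
        # A trailing pause is not an intonation contour.
        return None
    for symbol in CONTOUR_SYMBOLS:
        if last.endswith(symbol):
            return symbol
    return None
```

A unit for which the function returns `None` is a candidate for review: either a contour symbol was omitted or the line break does not fall at an intonation unit boundary.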
A: ... But he's -- ... He's 'decided he wants to be 'ca=lled ^Doc. ((TRN_FARM)) (11b)
J: ... And he= -- .. and he .. ^k=icks my 'feet 'apart, ((TRN_J&J)) (11c)
D: ... 'you know, .. to 'get leads, and 'talk -- .. 'communicate with 'people on the ^phone. ((TRN_CARS)) (11d)
A: ... So I%- -- .. I%- -- .. I ^get in the 'ca=r, ((TRN_DOOR)) (11e)
A: .. (HH) .. And there's -- ... (%) ^Nothing -- .. ^Nothing with two ^tee='s in it, ... does he ^get 'ri=ght. ((TRN_FARM)) (11f)
R: He 'doesn't have any -- ...(.8) He 'doesn't 'know what's going 'on in this ^world. ((TRN_RANCH))

2.3 Marked quality or prosody.

(12) < Y words Y> marked quality, tempo, etc.

The angle brackets <> can be used (in conjunction with an additional symbol, represented above by Y) to indicate that the stretch of text which they enclose has a marked quality of some sort; the particular quality (higher pitch, increased loudness, etc.) is specified by the supplementary symbol. The amount of text enclosed within these symbols often amounts to several words, and can run across several lines. The marked quality is judged relative to the surrounding discourse produced by the same speaker (e.g., a sentence would be marked for tempo if it is noticeably quicker or slower than the speaker's current or usual tempo). This set of symbols is in principle open-ended, and new ones can be developed to suit a particular investigator's needs. Listed below are some of the more common ones. In our own transcribing, we use these notations sparingly. Also, we use angle brackets to frame only a whole word or group of words; we do not try to place them within a word (e.g. to mark its final syllable as piano). (No space appears between the bracket and the supplementary symbol; but a space precedes and follows each symbol pair.)

< H H> high: raised pitch
< L L> low: lowered pitch
< R R> rapid: quicker tempo
< S S> slow: slower tempo
< F F> forte: increased loudness
< P P> piano: decreased loudness
< Q Q> quotation: quoted quality
<% %> creaky voice, glottalized
< MARC MARC> marcato: each word distinct and emphasized
< ACC ACC> accelerando: gradual speeding up
< DEC DEC> decelerando: gradual slowing down
< PAR PAR> parenthetical prosody
< WHIS WHIS> whispered words

Following are several instances of the above special quality notations.

(12a)
M: ...(.9) < WHIS It 'isn't the ^same 'thing WHIS >. X: ... ^Looks like it, ((TRN_LUNCH)) (12b)
A: .. they 'let us 'alone. ...< WHIS 'But we were ^scared, .. And 'boy WHIS >, did we ^ever get in 'trouble, from 'Milt and 'Arnold. ((TRN_AFR)) (13) < P words P> piano: decreased loudness This angle-bracket pair can be used to enclose a stretch of speech which is produced with relatively decreased loudness. (13a)
J: .. < % a=nd I think, < P Well P>, .. this is a 'terrible .. ^technique to use % >. ((TRN_AESTH)) (13b)
R: (%) .. (HH) (%) ... (%) .. 'But .. uh=, ...(3.0) < P 'What was I going to 'say P>. (3.5) X%- -- 'O=h, it's ^really 'ti=ring, though. ((TRN_RANCH)) (13c)
S: .. 'you= .. 'aren't ^aware of any of that. J: (HH) Yeah. S: .. [Yeah 1] J: [< P< X Right X>P> 1]. ((TRN_AESTH))

(14) < MARC words MARC> marcato

The pair < MARC MARC> can be used for a stretch of marcato speech, in which each word is uttered distinctly and with emphasis.

(14a)
J: ... But the 'goldfish got ^s=tuck, ... < MARC 'h=alfway 'into his ^mouth MARC>. ((TRN_J&J))

(15) < Q words Q> quotation

This pair indicates direct quotations. Its use is warranted where there is some actual shift in the quality of the stretch of quoted speech, as when the quoting speaker imitates some mannerism of the quoted speaker. (Whether the notation is appropriate where no such shift is audible is debatable.)

(15a)
J: .. 'This is a ^literal 'quote, .. he 'says to me, ... (HH) < Q I'm 'going to ^res=train 'you. .. to the ^fence Q>. ((TRN_J&J)) (15b)
G: and 'then he'd 'say, .. (HH) < Q 'I 'can't ^believe it. 'Nobody will 'pick me ^up Q>. ((TRN_HYPO)) (15c)
A: and he's 'say=ing, ...(1.7) (TSK) (HH) .. < Q 'A=h, ^yea=h, .. We 'call 'ourselves, the 'special ^forces of Santa 'Monica Q>. ((TRN_FARM))

Note that the quotation symbol is not used for metalanguage, such as the name of a letter or a reference to a word.

(15d)
A: and he 'spelt ^hee=l, h e a ^l=, S: .. @ A: and he 'spelt ^said, .. s i a ^d. ((TRN_FARM))

(16) < Y< Z words Z>Y> multiple marked features

When a stretch of speech is characterized by two or more coextensive special qualities worth noting, these can be indicated with multiple angle brackets. (The several angle-bracket notations are juxtaposed without any space between them.)

(16a)
J: .. So the 'guy 'yells at me, ... (0.9) < Q< F Is 'that your ^dog F>Q> ? ((TRN_J&J)) (16b)
G: .. They're ^drunk. .. < Q< F ^Where's these ^Americans F>Q>? They come ^bursting in the ^room. ((TRN_HYPO)) (16c)
ALL: [ @== 1] D: [ < X< P< @ We 'all like to 'eat @>P>X> 1]. ((TRN_DOOR))

2.4 Segmental phonetic detail.

(17) % glottal stop, glottalization

The percent sign indicates the presence of a (prosodic) glottal stop or glottalization. The percent sign is not written in positions where it is phonologically predictable, e.g. at the beginning of vowel-initial words (under certain conditions) in English. Nor is it written where it is lexically required, as commonly occurs in languages with phonemic glottal stop -- for which a distinct symbol should be used. One reason for taking the trouble to transcribe glottal stop is that speakers often seem to use it when they abandon a word or utterance. If glottal stop functions as an (effective) cue for abandoned utterances, it is useful to have it on record. Glottal stop and glottalization may act as a cue to other aspects of the discourse production process as well. (The % is written without surrounding spaces if it is part of a word. If it occurs as an isolated vocal noise, it is written within parentheses, which are surrounded by spaces.)

(17a)
S: ... (%).. < Q It's ^Thanksgiving 'time ^now, ((TRN_AESTH)) (17b)
R: ... 'Down ^the=re, .. u=m, .. it's ^mandatory, .. you have to% -- (%).. to ^graduate, .. you ^know, .. (%) 'well, to ... ^get the degree=, you know, ...(HH) you ^have to 'take this ^class. ((TRN_RANCH)) (17c)
J: ...(2.4) (TSK) that the=% | ...(.8) 'set of ^sentences, ((TRN_AESTH)) (17d)
J: (0) (HH) <% Tha%- .. this% -- .. I ^wonder 'abou=t that though, I mean %>, .. when 'I think of ^a=ds, ((TRN_AESTH))

(18) - (tilde)/dash truncated (uncompleted) word

The single tilde indicates that a word is not completed: the end of the word is not uttered. This symbol often occurs in conjunction with a glottal stop, but not always -- either may occur independently of the other. The truncated word in question can be written out in full to achieve normalization; where it seems significant, the actual pronunciation can be written using phonetic notation (see next item). (No space appears between word and tilde.)

Note that even if none of the segments (phonemes) of a word is entirely absent, a truncation may still be involved if the final segment is cut off before it reaches the full duration it would have in a normal pronunciation. For example, if the word the is pronounced so that the final vowel is interrupted (e.g. by a glottal stop) before it reaches half the duration it normally would reach, this warrants use of the word truncation symbol (the%-).

(18a)
A: But 'it was -- ... till 'five%- -- I 'remember, .. ^fi=ve o'clock | I 'finally got the 'door in, ((TRN_DOOR)) (18b)
J: ... You 'know how they ^do that, so you 'can't s- .. 'ha- -- .. you don't 'have any ^balance. ((TRN_J&J)) (18c)
N: .. and I 'came up 'behind him, and I wa%- -- .. I was ^hugging him, while he was ^shaving. ...(HH) 'And as ^I was 'hugging him, ...(0.8) 'he just 'sli%- .. ^dropped. ... ^slipped from my 'hands. .. to the floor. he like ^f=ainted. ((TRN_J&J))

(19) ((text)) phonetically transcribed words

This symbol complex encloses a representation of the actual pronunciation of a word. This transcription is given in addition to the traditional orthographic representation of the same word(s), which it follows and to which it is linked by the underscore character (_). The material within the parentheses can be written in a phonemic or broad phonetic transcription in International Phonetic Association (IPA) symbols; in another system for representing pronunciation, such as the system for English phonemic transcription using ordinary roman letters, called UNIBET (MacWhinney 1988:32ff); or -- where ambiguity will not result -- in standard orthography supplemented by selected phonetic symbols (e.g. stress marks applied to the standard spelling of a word). Phonetic transcription is used only where the actual pronunciation of a word is of special significance for the analyst's purposes. Most of the time standard orthography used alone is sufficient. (No spaces appear between the parentheses and the transcribed segments.)

(19a)
J: in= t- 'terms_ ((torms)) .. ^terms of, ((TRN_AESTH)) (19b)
R: ... 'You don't 'really 'realize you're ^progressing_((progressing)). ((TRN_RANCH))

2.5 Nonverbal vocal sounds.

(20) (TEXT) nonverbal vocal sound

Single parentheses surrounding a description written in capital letters (COUGH) are used to indicate nonverbal sounds produced in the vocal tract of speech event participants. This encompasses throat-clearing, coughs, clicks, breathing, etc., but not dish-washing, finger-drumming, dogs barking, etc. (for which double parentheses are used, §2.7). The reason for distinguishing vocal tract noises made by speech event participants as a special category is that participants often use this channel to give each other subtle cues about aspects of the on-going linguistic interaction, e.g. breathing in to signal the purpose to speak next. Crickets chirping and microphones rustling do not consistently carry such interpersonal meanings for humans. The next few items present common instances of this notation.

(21) (TSK) click

This indicates the utterance of a click (usually alveolar) as an isolated vocal noise, e.g. what is commonly written tsk in newspaper cartoon style.

(21a)
R: .. and ^the=n, ...(1.2) (TSK) (%) ^our 'job, is to 'shape the ^shoe=, ... to the 'horse's ^foot. ((TRN_RANCH)) (21b)
S: ..(HH).. 'u=m, .. (TSK) 'ha=s ... ^something= .. to= |.. ^communicate, .. with 'me=, ((TRN_AESTH)) (22) (THROAT) throat-clearing This indicates the sound made by someone clearing their throat. (22a)
S: (HH) (THROAT) .. Yea=h. ((TRN_AESTH)) (22b)
S: ... (GULP) (TSK) The ^gap is very 'big. ((TRN_AESTH)) (23) (HH) inhalation This indicates audible inhalation. (The number of H's is by convention fixed at two.) (23a)
A: ...(1.0) (HH) 'A=nd, ((TRN_FARM)) (23b)
G: ...(1.4) (HH) .. ^I've got to get 'out of that 'place, man, I 'swear. ((TRN_CARS)) (23c)
K: ... (HH).. @^leukemia=, ... (HH) ^bronchitis, ... (HH)uh=, ... ^tuberculosis, .. @@@@ (HH) .. and he's ^recovered from all. ((TRN_HYPO))

(24) (HHx) exhalation

This indicates audible exhalation. (The number of H's is fixed at two.)

(24a)
B: ...(4.3) (HHx) ... ^Kids in the 'city | 'miss so 'mu=ch. ((TRN_DEPR)) (24b)
S: (HHx) (TSK) .. an ^artist, ((TRN_AESTH)) (24c)
J: ...(1.5) So= .. the%- (HHx) -- ...(2.2) Well. ((TRN_AESTH))

(25) @ laugh syllable

This symbol indicates a laugh, produced as a vocal noise separately from any words produced by the same speaker. One token of @ is used per syllable of laughter (when the laughter is brief; for extended laughter, see the following symbol). Note that a laugh can be rhythmically integrated as part of a larger (major) intonation unit, or it can be produced as a separate intonation unit of its own (Du Bois, forthcoming a).

(25a)
K: .. @@@@ ... (HH) From which you ^haven't recovered. ((TRN_HYPO)) (25b)
S: ...(1.0) @ (HH) There 'isn't any ^rea=l 'communication going on. J: (0) Yeah. ((TRN_AESTH)) (25c)
A: .. 'That was the ^only thing that went 'smoo=thly, that we've ^ever do=ne. B: .. @ That ^you='ve. ... ^I couldn't even ^begin to do it. ((TRN_DOOR)) (25d)
J: .. The 'conclusion is up to ^you=. S: [ m=hm 2], J: [ @@@ 2] in 'going out to -- (HH) ... to ^buy the thing. ((TRN_AESTH))

(26) @== extended laughter

This symbol can be used for laughter of extended duration, when the investigator is not currently interested in indicating how many syllables of laughter there are, or when such indication is not feasible.

(26a)
ALL: [ @== 1] D: [ < X< P< @ We 'all like to 'eat @>P>X> 1]. ((TRN_DOOR))

If the actual duration of the laughter is deemed important, it can be timed with a stopwatch and indicated within double parentheses, which are linked to the laughter symbol by an underscore: thus @==_((6.2)) would indicate laughter lasting 6.2 seconds.

(27) @N nasal laugh

This symbol is sometimes used for nasal laughter, in which the air is emitted through the nose. (The unmarked symbol for laughter, however, is simply @.)

(27a)
J: ... You're ^not supposed to 'use these 'powerful [ ^techni=ques 1]. S: [ @N@N@N@N 1] (HH) ... Hm=. ((TRN_AESTH))

(28) <@ words @>, @word laughing while speaking

The angle bracket pair (either < @ @> or < @N @N>, as appropriate) indicates laughter over a stretch of speaking (the words enclosed between the two @'s or @N's). Ordinarily we use these symbol pairs to frame only a whole word or group of words; we do not try to indicate laughter on particular syllables within a word. If a laugh occurs during the utterance of just one word, this can alternatively be indicated simply by prefixing the word with one "@" sign.

(28a)
A: .. (HH) .. and they ^stepped out in the 'road, and ^not only did they have ^uniforms on, but they < @ 'also had ^gun=s= @>. [ @@@ 1] B: [ (HHx) 1] ((TRN_AFR)) (28b)
S: (0) It's @^pleasing (HHx) ((TRN_AESTH)) (28c)
K: .. @ G: ... @ There isn't -- It's < @ ^no 'disea=se, at 'a=ll @>. K: .. 'Athletic feet. ... @N .. 'foot. D: .. @N .. @'foot. ((TRN_HYPO)) (28d)
N: 'You know, 'this was a 'rented @^snake, @, ((TRN_J&J))

2.6 Filled pause and backchannel words.

The following list presents a set of orthographic conventions for spelling sounds used in filled pauses, backchannel, and so on, in spoken English. The purpose of the list is to standardize the spelling of sounds and words that don't ordinarily appear in English dictionaries, so that they can be transcribed consistently and identified systematically by computer. The conventions are based roughly on those used in American newspaper cartoons. (The glosses are given only to suggest to the reader which sound is meant, and are not intended as actual analyses of discourse functions.) In these notations, n roughly indicates nasalization of the preceding vowel, and - (hyphen) corresponds to a glottal stop. In actual transcriptions, the lengthening symbol (=) very often occurs in these words.

(29) uh hesitation (filled pause) um unh m backchannel, awareness, wonder hm huh hunh mhm affirmative response (final syllable stressed) unhunh uhuh unh-unh negative response (initial syllable stressed) uh-oh alarm cry

(29a)
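Since the stated purpose of the list is to let these forms be identified systematically by computer, a lookup can be sketched as follows. This is an illustrative assumption about how such identification might be done, not a procedure given in the text: the names `LISTED_FORMS`, `normalize_token` and `is_listed_form` are hypothetical, the glosses are deliberately left out, and the normalization only strips the lengthening, stress, laugh and contour symbols that appear attached to these words in the examples.

```python
# Spellings taken from the list in this section (glosses omitted).
LISTED_FORMS = {
    "uh", "um", "unh", "m", "hm", "huh", "hunh",
    "mhm", "unhunh", "uhuh", "unh-unh", "uh-oh",
}


def normalize_token(token: str) -> str:
    """Strip transcription symbols from one whitespace-delimited token."""
    bare = token.replace("=", "")   # lengthening marks
    bare = bare.lstrip("^'`@")      # stress and laugh prefixes
    bare = bare.rstrip(",.?!")      # trailing intonation contour symbols
    return bare.lower()


def is_listed_form(token: str) -> bool:
    """True if the token is one of the standardized spellings above."""
    return normalize_token(token) in LISTED_FORMS
```

For example, the tokens `M=hm=,` and `'Hm=.` both reduce to listed spellings, whereas an ordinary word such as `the` does not.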
J: .. I 'think of |... (1.2) 'aesthetics. ..@ @a=nd, S: .. M=hm=, J: u=h, S: ... (1.5) 'Hm=. ... @ J: ... 'creation of ^desi=re, .. for ^one thi=ng. S: m=hm=, ((TRN_AESTH)) (29b)
J: .. (HH) .. And I thought, ...(0.7) < Q ^Uh-oh= Q> ((TRN_J&J)) 2.7 Transcriber's perspective. (30) < X words X> uncertain hearing This pair encloses portions of the text which are not clearly audible. The words so enclosed represent the transcriber's best guess as to what was said, but their accuracy is not assured. (30a)
J: .. < X I mean X> 'why do people actually ^wa=lk .. 'into=, (HH) ^art museums. ((TRN_AESTH)) (30b)
G: ...(1.2) Well, I [ ^don't 1] 'normally 'sound like ^Lucille 'Ball. K: [ < X That's X> 1] -- ((TRN_HYPO))

(31) X indecipherable syllable

The capital letter X indicates segments of speech which are not audible enough to allow a reasonable guess at what was said. One X is used for each syllable of indecipherable speech. (It is usually possible to make at least a rough estimate of how many syllables were uttered, even when one can't make out what the words are.) (Such X's are written alone, without the angle bracket-X symbol which indicates an uncertain hearing.)

(31a)
A: (0) It's ^some 'story, XX. ((TRN_DOOR)) (31b)
D: .. It was 'basically ^me=, 'you know, X 'going ^out. .. The 'problem of going ^out. ((TRN_CARS)) (31c)
A: .. And he's got < P ^all this, .. < X 'you know X>P>, ... and 'everything ^else X, ((TRN_FARM))

(32) ((COMMENT)) researcher's comment

This pair encloses any comment the transcriber or researcher chooses to make. It can be used as well to note the occurrence of noises not made in the human vocal tract, though such sounds are usually written only if they are relevant to the human interaction at hand (as when speech event participants comment on or otherwise react to the noise). Comments are best kept short. Writing comments in all capital letters helps to visually distinguish these words from the words actually uttered by speech event participants (§1.3.1). One common comment, as standardized in brief form, is ((MIC)), which indicates noise from the microphone when it is moved (e.g. by the investigator).

(32a)
N: .. the ^way that | .. the 'Indians ^li=ve, .. like Cany%- .. [ Canyon de 1] 'Chelly= ? X: [ ((BLOWS_WHISTLE)) 1] J: ... < P It's a 'whistle P>. N: ... The ^way that the 'Indians ^li=ve, ... (HH) is ^incredible. .. They 'still 'live, .. u=m, .. 'mi=les and 'mi=les ^apart from each other, .. in ^ho=gans, .. (HH) And they're s- .. 'intersper=sed, .. and% -- .. and they're=, ...(.8) 'you know, ...(.9) ((DOG_BARKS_EXCITEDLY)) .. @@@@@ ..(HH) @@@ (HH) (HHx) J: You 'know% -- .. You 'know%, .. about ^this 'piece ? N: .. < PAR 'She ^always does that PAR>. ((REF_TO_DOG)) ((TRN_J&J)) (32b)
J: (0) 'I spend a 'lot of ti=me, ((MIC)) ...(1.0) ^analyzing 'a=ds, .. 'myself, ((TRN_AESTH)) (32c)
A: .. ^Think of your 'door, .. ^here. ((GESTURES?)) ((TRN_DOOR))

If it is important to make clear that a given comment applies just to a certain stretch of speech, this can be indicated by enclosing the relevant stretch of speech in angle brackets, and writing the comment within the usual double parentheses. A numerical index is then attached to both the angle brackets and the associated comment, as follows: <1 words 1> ((COMMENT))

2.8 Turns and overlap.

(33) A: the speaker is A

The speaker of a given line of the transcription is indicated by a code or a proper name (written all in capital letters) at the start of the turn or backchannel (as the first item in the line, to the left of the spoken words). Successive lines uttered by the same speaker need not be so marked. The speaker code or name is followed, without an intervening space, by a colon. (At least one space or tab appears between the colon and the beginning of the text.)

While speakers can be represented by codes like "A" or "B", one often gets a clearer impression of who the participants are if their utterances are tagged with personal names, which are more memorable. The names can be the speakers' own, or made-up names, depending on privacy considerations. Names are especially important if the speakers use the names to refer to each other during the course of a conversation -- in which case, obviously, the (made-up) name in the speaker label should match the (made-up) name in the speech (§2.9). (When it is unclear which of several speakers on a tape is responsible for a particular utterance or noise, the symbol "X:" is used to label the unidentified speaker.)

(33a)
A: .. 'No=w that we have the [ ^si=de door 1] fixed, he could. B: [ That's 'kind of 1] -- .. Yea=h, C: (0) @Yeah (HHx). D: ... Sure. ((TRN_DOOR)) (33b)
JACK: 'That's all it ^does. .. It 'doesn't [ .. even 1] ^reach a 'conclusion. SANDY: [ m=hm 1], JACK: .. The 'conclusion is up to ^you=. SANDY: [ m=hm 2], JACK: [ @@@ 2] in 'going out to -- (HH) ... to ^buy the thing. SANDY: .. 'Hm=. .. 'Hm. (HH) ...(1.0) O=kay=. ((TRN_AESTH)) (33c)
X: [ ((BLOWS_WHISTLE)) 1] ((TRN_J&J)) (34) [ words n] speech overlap (new convention: [n words n]) Square brackets are used to indicate the beginning (left bracket) and ending (right bracket) of overlap between the utterances of two speakers. One set of brackets is inserted surrounding the first speaker's overlapping utterance portion, and a second set of brackets surrounds the second speaker's overlapping portion. This notation signals that the two bracketed utterance portions were uttered at the same time. A numerical index (n = 1, 2, 3, ...) is then assigned to the overlap, and is inserted into each speaker's overlap (prefixed to the right bracket that marks the ending of the overlap). If several overlaps occur within a short stretch of text, these index numbers serve to mark which bracketed text portions go together; successive numbers are used to make clear what is overlapping with what. When there is no danger of confusion (i.e. after a stretch with no overlaps), numbering should restart with 1. We do not put square brackets within a word. That is, we do not try to indicate the exact syllable or segment where overlap begins and ends, since we have found that such precision is difficult to achieve reliably, and for our purposes may not merit the additional time spent. (It also makes transcriptions harder to read.) If a substantial portion of a word overlaps, it is included within the brackets; if only a small portion overlaps, it is not. (34a)
B: ... I 'remember, ...(.8) I 'used to 'help Benny, and I'd get ^twenty-five 'cents a 'week. R: ... (1.2) [ A ^week 1] ! B: [ 'Twenty 1] -- ((TRN_DEPR)) (34b)
B: ... 'They were kind of ^scary. ...(1.6) the [ 'gypsies 1]. R: [ mhm 1], ((TRN_DEPR)) (34c)
A: .. (HH) 'But, .. [ the 'thing ab- 1] -- B: [ The 'spe=cial 1] ^f=orces ! A: (0) 'Yea=h. ... [ But the 'thing ^about him 2] -- B: [ This 'place is getting 2] ^wei=rd. ((TRN_FARN)) (34d)
G: ... (.7) Well, the ^worst [ thing | 'I ^ever had, was ^brai=n 1] fever, K: [ @ @^He's a 'medical 'miracle 1]. G: when I < X had X> [ proposed 2] to ^her. D: [ @@ 2] K: .. @@@@ ... (HH) From which you ^haven't recovered ((TRN_HYPO)) (34e)
B: ... But 'I thought ^Mom was 'raising= | ... (.7) ^hemp, or, ...(1.1) [ 'something 1] one time. R: [ ^what 1] ? ... [ ^Hemp 2]. B: [ 'Hemp 2]. ((TRN_DEPR)) (34f)
B: (0) 'Cliff is ^still | .. 'screaming about ^tha=t, R: ... [ Because he 'wanted the ^stamps 1], B: [ all those ^stamps 1], ... 'Mom let ^Tim 'Canon have. ((TRN_DEPR)) (34g)
J: .. [ 'yeah 1]. S: [Which= 1] .. ^colors ... ^a=ll of the 'communication, [after 1] that. J: [ Yeah 1] . ((TRN_AESTH)) (35) ____ overlap placeholder Given that in the present transcription system, the intonation unit must not be fragmented onto two different lines (§2.10), it is sometimes useful, in cases of complex speech overlap, to have a symbol that can be placed within one speaker's intonation unit as a placeholder with which another speaker's words can "overlap". For further explanation of the conditions which warrant use of this symbol, see §2.10 and Du Bois (forthcoming a). (35a)
J: (HH) ^why [ did 1] people ^tra=sh that% -- S: [ yeah 1], J: .. [ the% 2] -- S: [ unhunh 2], J: .. you know, whe=n .. < PAR u=h PAR> .. 'Stravinsky had his .. [ ___ 3] ^premie=re, S: [ m=hm 3], m=hm, ((TRN_AESTH)) (35b)
J: (0) Tha%- .. 'that's t- | .. where [ the 2] ^co=gnitive .. [ 'bias .. 3] | kind of [ ___ 4] .. (HH) [ ^conce=rns 5] me. S: [ mhm 2], [ mhm= 3], [ mhm 4], [ 'Hm= 5]. ((TRN_AESTH)) (36) (0) latching This symbol (zero in single parentheses) indicates that the following utterance latches the preceding utterance (i.e. there is no pause -- or zero pause -- between the two speakers' turns). Since it symbolizes a noticeable lack of pause between actual turns, mere continuative backchannel responses (mhm, etc.) are not ordinarily marked with this symbol. (36a)
A: They 'get their 'snake? R: (0) ^Yeah! ((TRN_AFR)) (36b)
G: ... < X Least X> she'll 'know what her ^good thing was. D: ... 'Yea=h. G: (0) ^That's for sure, D: (0) 'Definitely. ((TRN_CARS)) (36c)
G: .. I was 'using number ^seven, .. 'gun number ^seven, D: (0) It ^broke the [ 'chisel 1]. G: [ and 1] it ^broke my 'chisel, man. < X Now X> -- D: (0) So 'now you have 'no chisel. G: (0) < X It's X> my ^only good 'chisel, man, ((TRN_CARS)) 2.9 Miscellaneous. (37) ZZZ code for suppressed proper names
The capital letter Z is occasionally used to replace censored proper names in the text (one Z per syllable of replaced text). Note that in most cases (especially where there is more than one name needing to be distinguished) it is preferable to make up names that retain some flavor of the original names (§2.8). (37a) ZZ ZZ (could stand for the speaker's utterance of, e.g. the words "Edward Sapir") (37b)
A: ...(.7) His ^name is= | .. ^Z . ((TRN_FARN)) (37c)
S: .. (HH) (TSK) He ^would be 'just about 'Z 'Z's a=ge. ((TRN_AESTH)) 2.10 Prosodic units. The symbols in this section are used to delimit prosodic units at various levels. They represent the boundaries between the units. (Discourse can also be usefully segmented into morphosyntactic and other kinds of units; see §2.13.3.) (38) CARRIAGE RETURN intonation unit boundary The end of an intonation unit (or the boundary between two intonation units) is indicated by a carriage return. Thus each intonation unit appears on a separate line. (For a definition of the intonation unit and a discussion of the cues for identifying it, see Chafe (forthcoming), Du Bois (forthcoming a), and Cruttenden (1986:35-45).) (No space appears between the carriage return and the final character in the line.) (38a)
A: 'Well, .. ^this is in ... 'bits and ^pieces , ((MIC)) but I was 'coming 'down the ^stai=rs, and he was there ^ta=lking, .. to this ^lady. ((TRN_FARN)) (38b)
S: (HHx) 'That's ^interesting , .. I mean, th%- that you should ^pai=r the word 'aesthetics, ... with [ ^advertising 1]. J: [ (HH) 1] ^Yea=h ! ((TRN_AESTH)) (38c)
A: for a ^new doo=r, and ^door ja=mbs, ^ha=rdwa=re, ^stai=n, ^pai=nt .. 'all the ^stuff that you 'nee=d, ((TRN_DOOR)) (38d)
M: ... It's that ^you=ng, .. [ ^pa=le 1], A: [ 'yeah 1]. H: .. 'guy with the ^da=rk 'hair. ((TRN_FARN)) Note that a speaker's intonation unit should not be broken up into two lines even if another speaker's utterance intrudes between the intonation unit's beginning and its end. In dealing with such cases the overlap placeholder symbol "___" (underscores) is sometimes useful (§2.8). (39) | intonation subunit boundary This symbol (pipe) separates one intonational subunit from the next, within one intonation unit. It is used where the intonation contour almost seems to warrant recognition of a new intonation unit, but not quite -- that is, where the unit has some of the features of a prototypical intonation unit, but not all. Needless to say, this is often a matter of close judgement, and should be evaluated accordingly. Some discourse researchers prefer not to use a concept of intonation subunit, and so would not use this symbol. This symbol is by convention associated with the following text, so that it precedes any pause which is associated with the following unit (§2.1). (39a)
S: ... [ 'Well 1], A: [ You're 'off 1] the ^highway, 'aren't you | ^here ? ((TRN_FARN)) (39b)
A: ... The 'hinge is | .. on the ^inside. B: (0) Right. ((TRN_DOOR)) (39c)
S: .. (HH) So= that the= .. ^reason | 'why I'm being 'communicated with, .. 'i=s | so that 'I can be 'made to ^do something. ((TRN_AESTH)) (39d)
A: which was ^like a | ... (HH) ^Workmate 'be=nch, .. type ^deal, with a 'gui=de, and everything, ((TRN_DOOR)) (40) SPACE word boundary Although in principle the word boundary pertains as much to morphosyntactic segmentation (§2.13.3) as to prosodic segmentation, it is so much taken for granted as a feature of any transcription that it is included here with the other basic discourse transcription notations. The space character is used to separate lexical words, as in normal orthographic convention. A space also separates other word-equivalent (for computer sorting purposes) symbols, such as punctuation, brackets, etc. As noted above (§1.3.4), for computational purposes it is useful to follow consistent conventions in inserting spaces in a transcription. Therefore, throughout this document we have commented on where spaces should and should not go. In the following example, each of the space-delimited strings is treated computationally as a word, allowing appropriate coding to be attached to the symbols for speaker code, latching, backchannel response, pause, audible inhalation, final intonation contour, etc., if desired. (40a)
S: (0) Hm=. .. Hm. (HH) ...(1.0) O=kay=. ((TRN_AESTH)) 2.11 Capitalization. (41) Capital "sentence" beginning Application of standard literary conventions for capitalization of word-initial letters -- beyond those governing proper names,28 which this transcription naturally follows -- presents a problem to the degree that the "sentences" of spoken discourse, if such units exist, do not neatly correspond to the sentences of written discourse. Punctuation symbols (period, comma, etc.) are used to indicate intonation contour, but the unit which in the spoken discourse transcriptions is delimited between two period symbols does not often correspond directly to a standard written sentence. Moreover, the resulting transcription does not always make for easy reading, to the extent that the punctuation symbols, given their intonational value, are not available to effectively cue the reader to any sentence structure per se. For these reasons, a capital initial letter is used to indicate the apparent beginning of a new sentence-like unit: perhaps the start of a new proposition, or a new speech act. Unlike in writing, there need not be any absolute correlation between a period at the end of one line and a capital at the beginning of the next. In fact, a very common configuration is a comma (,) or double hyphen (--) at the end of the first line followed by a capital at the beginning of the second. Since the capital letter is taken to mark only the beginning of one of these sentences, and not necessarily the end of the previous one, there is no need for the previous sentence to have been brought to a full conclusion. Thus several false-start intonation units in a row, each beginning (or attempting to begin) the same sentence, are each written with an initial capital, even if only the last of the units is ultimately brought to completion as a full sentence. 
It is important to emphasize that since capitalization is not claimed to mark prosody (already marked by punctuation symbols), its primary use in the present system is to provide a rough feel for something of the spoken discourse's sentence unit boundaries (possibly correlated with conceptual, speech act, or rhetorical units), and thus to make the transcription more readable. It should be kept in mind, however, that the nature of the contrast signaled by capitalization is not easy to codify precisely. There is no claim that the capital letters consistently correspond to a specific acoustic cue in the audio record, nor that they are even necessarily audible. Neither is any hard and fast structural or functional analysis intended. In this sense capitalization is simply a rough display device which is available for use at the transcriber's or researcher's discretion, and should be interpreted in this light.29 (41a)
K: (HH) .. But ^he'll recover. He'll% --. D: (0) what ^is that. K: (0) 'He'll be over his leprosy [ ^soo=n 1]. G: [ 'Nothing 1], it's just 'dry ^ski=n. K: .. @ G: ... @ There isn't -- It's <@ ^no 'disea=se, at 'a=ll @> K: .. 'Athletic feet. ... @N .. 'foot. D: .. @N .. @'foot. ((TRN_HYPO)) 2.12 False start. (42) <FS words FS> false start For a widely-known language like English it is probably best to avoid inserting implicit judgments about correctness and repair at the transcription level (Edwards 1967). (Such interpretations are of course commonplace, and fully appropriate, at the more interpretive and theory-bound level of E.) But the picture changes when one considers little-known languages. A linguist who publishes a transcription of a language that is known by only a few individuals in the world would do a decided disservice by simply reproducing all the words as spoken, without any indication of which were considered correct and which were not, in the eyes of the native speaker. This is, after all, the kind of speaker knowledge which native speakers of English make use of without thinking when they read and understand a transcription in English which does not overtly alert them to the disfluencies it contains. But in a little-known language, such knowledge may well be inaccessible to any but the linguist who published the text and one or more native speakers in a faraway place. One solution that has often been adopted is to edit out disfluencies in the text, in accordance with judgments of a native speaker. While this kind of edited text is appropriate for some purposes (e.g. publication of indigenous literature as the native author would have it presented), for serious spoken discourse research (of the sort that takes account of the process of discourse production), it is obviously preferable to retain every word exactly as uttered. 
If care is taken to indicate, for the benefit of the non-native speaker, which items are editable, these readers can then have the best of both worlds -- they can skip over the (marked) false starts to obtain an edited version, and include them to better understand the discourse production process. But if the distinction between false starts and natively ratified material is not indicated, no one who lacks access to a native speaker can reliably reconstruct this information. Thus, while one probably should not specially mark false starts in a transcription of English discourse, one should do so in, for example, a language like Xinca or Sacapultec Maya. The angle bracket notation is made available for this purpose. (English examples are presented below with this notation just to illustrate how it would be used.) (42a)
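Retaining every word does not rule out an edited reading, provided the false starts are marked. The following Python sketch illustrates this with the <FS words FS> notation of (42): one function drops the marked stretches, the other lists them. The function names and the regular expression are illustrative assumptions and not part of the transcription system; the pattern accepts both the "<FS" spelling of the heading and the "< FS" spelling used in the examples.

```python
import re

# One marked false start, e.g. "< FS they% FS>" or "<FS they% FS>".
# Assumption: false-start brackets are not nested inside one another.
FALSE_START = re.compile(r"<\s*FS\s+(.*?)\s+FS\s*>")


def strip_false_starts(line: str) -> str:
    """Return the line with every marked false start removed."""
    without = FALSE_START.sub(" ", line)
    # Restore the single-space separation between the remaining symbols.
    return re.sub(r" {2,}", " ", without).strip()


def false_starts(line: str) -> list[str]:
    """Return the words of each marked false start, in order of occurrence."""
    return FALSE_START.findall(line)


if __name__ == "__main__":
    line = "A: and < FS they% FS> -- .. they% .. ^poked into the ^mou=lding,"
    print(strip_false_starts(line))
    print(false_starts(line))
```

The truncation symbol (--) and any intonation punctuation that follows a false start are left in place; whether an edited version should drop them as well is a separate editorial decision that this sketch does not make.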
A: .. < FS He has= FS> -- < FS a% FS> -- The ^spelling is what 'first 'turned me on ^to him. ((TRN_FARN)) (42b)
A: and < FS they% FS> -- .. they% .. ^poked into the%- | .. the ^mou=lding, along the [ 'side 1]. B: [ unhunh 1], ((TRN_DOOR)) (42c)
G: ... 'A=nd, .. 'you know, .. < FS 'He= would like FS>, .. (HH) 'He would like, ^w=alk out on the ^freeway, and 'try to ^hitchhike, ((TRN_HYPO)) (42d)
J: [ @@@ 2] in 'going out < FS to FS> -- (HH) ... to ^buy the thing. ((TRN_AESTH)) 2.13 Reserved symbols. Some of the symbols that are not used in transcribing need to be reserved for other important uses. Bookkeeping, phonemic orthography, and morphosyntactic coding all call for the use of some specialized symbols. Each of these domains is addressed below. In addition, a few symbols are left undefined, free to accommodate the diverse special needs of users of the system.