<p> is used to indicate the start of a new paragraph. However, many codes also have a corresponding closing or end tag, which is marked by a forward slash (/) after the less-than symbol. So the end of a paragraph would be encoded as </p>. Elements may also contain what are called ‘attributes’. For example, the code <pause dur=4> could be used in a transcription of speech to indicate a pause lasting 4 seconds. Here, the attribute is dur (duration) and its value is 4 (seconds). Different forms of SGML have been employed for a range of purposes. HTML (HyperText Markup Language), for instance, uses a predefined set of codes based on the general SGML rules: bold print is specified in HTML with the code pair <b> and </b>. See Bryan (1988) and Goldfarb (1990) for more information about SGML. (See also Text Encoding Initiative (TEI).)

standardisation
When comparing frequencies across or within corpora, it is often useful to standardise the results in order to take into account the fact that files or corpora may be of different sizes. This can be achieved by expressing each frequency as a percentage of the total word count or as occurrences per x words (for example, per million words). Table 8 shows frequencies for the six age groupings in the British National Corpus (BNC) for the word cheerio.
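The arithmetic behind standardisation can be sketched in a few lines of Python. This is only a minimal illustration: the function name `per_n_words`, the sub-corpus labels and all of the counts are invented for the example and are not figures from the BNC.

```python
def per_n_words(raw_count, corpus_size, base=1_000_000):
    """Express a raw frequency as occurrences per `base` words.

    base=1_000_000 gives occurrences per million words;
    base=100 gives a percentage of the total word count.
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be a positive number of words")
    return raw_count * base / corpus_size


# Hypothetical sub-corpora: (raw count of the word, total words in sub-corpus).
# These numbers are made up for illustration; they are NOT BNC figures.
subcorpora = {
    "A": (30, 2_000_000),
    "B": (30, 500_000),
}

# Standardise every raw count to occurrences per million words.
normalised = {
    name: per_n_words(count, size)
    for name, (count, size) in subcorpora.items()
}

for name, value in normalised.items():
    print(f"sub-corpus {name}: {value:.1f} per million words")
```

In this invented case both sub-corpora contain the word 30 times, but after standardisation the word turns out to be four times as frequent in B as in A, which the raw counts alone would hide.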