Digital humanities


Maintained by: David J. Birnbaum (djbpitt@gmail.com) [Creative Commons BY-NC-SA 3.0 Unported License] Last modified: 2023-02-08T14:26:07+0000


HTML basics

General

All HTML files created in this course must be HTML5 using XML syntax. That means that:

HTML5 was elevated to recommendation status only in October 2014, and there are a lot of legacy pages on this site and elsewhere that use earlier versions of HTML. Don’t just copy one of those; new content must be HTML5 using XML syntax.

HTML5 skeleton

An HTML5 document with XML syntax has the following skeleton:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title></title>
    </head>
    <body>
    </body>
</html>

The <head> element at the top of the document is for information about the page (metadata), including the <title> element, which will be displayed in the tab or the title bar in the browser and not inside the window that displays your page. It is only the content of the HTML <body> element that shows up inside the browser window.
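
As an illustration, here is a sketch of the same skeleton with some content filled in. The title and paragraph text are placeholders invented for this example:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <!-- Metadata: shown in the browser tab or title bar, not in the page -->
        <title>My first page</title>
    </head>
    <body>
        <!-- Only the content of body is displayed in the browser window -->
        <p>Hello, world!</p>
    </body>
</html>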

Important HTML elements

You can look up the various HTML elements at the w3schools site, but most sites are built from a very small number of elements:

Paragraphs

Paragraphs in HTML are <p> elements.
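
For example, two consecutive paragraphs might be tagged as follows (the text is a placeholder). Because the document uses XML syntax, each <p> start tag needs a matching </p> end tag:

<p>This is the first paragraph.</p>
<p>This is the second paragraph.</p>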

Lists

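Lists in HTML have two parts: a wrapper element, either <ol> (ordered list) or <ul> (unordered list), and the individual list items that it contains.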

For both ordered (numbered) and unordered (bulleted) lists, each individual list item is an <li> element. Note that whether the list items are preceded by numbers or bullets is typically configured not at the level of the individual item, but by whether you choose <ol> or <ul> as your wrapper. For example, the following code produces an ordered (numbered) list:

<ol>
    <li>First item</li>
    <li>Second item</li>
    <li>Third item</li>
</ol>

To change it to an unordered (bulleted) list, change the wrapper from <ol> to <ul>. The <li> elements inside are the same for both types of lists.
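
Applying that change to the example above would give something like the following sketch, in which only the wrapper differs:

<ul>
    <li>First item</li>
    <li>Second item</li>
    <li>Third item</li>
</ul>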

Headings

There are six levels of headings in HTML: <h1>, <h2>, and so on through <h6>. The first of these is very large and very bold, and they get progressively smaller and lighter. They are intended to be used hierarchically; the top-level heading on your page is typically an <h1> (and there should be only one because “top-level” means “applies to the whole page”), major subsections use <h2> as headings, etc. Don’t use heading levels just for presentational effects; use them hierarchically, and control the appearance with CSS.
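
A minimal sketch of hierarchical heading use, with placeholder titles invented for illustration, might look like this:

<h1>Page title</h1>
<h2>First major section</h2>
<h3>Subsection of the first major section</h3>
<h2>Second major section</h2>

Here the single <h1> applies to the whole page, each <h2> opens a major section, and the <h3> belongs to the <h2> that precedes it.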

Tables

Tables (<table>) in HTML consist of rows (<tr>) that, in turn, consist of cells (<td>). Do not use the @border attribute, which is invalid in HTML5 with XML syntax; table borders must be configured only with CSS. The following code creates a two-row, three-column table:

<table>
    <tr>
        <td>First row, left-most column</td>
        <td>First row, middle column</td>
        <td>First row, right-most column</td>
    </tr>
    <tr>
        <td>Second row, left-most column</td>
        <td>Second row, middle column</td>
        <td>Second row, right-most column</td>
    </tr>
</table>

For the labels at the tops of columns in a table, use <th> (table header) elements instead of <td>; this reflects the meaning (they are headings, not data), and browsers by default will render <th> elements as bold and centered, which is a natural way to label a column. There are additional attributes (@colspan and @rowspan) that you can use to create cells that span multiple columns or multiple rows.
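
The following sketch adds a header row to a table like the one above and uses the @colspan attribute to make one cell span two columns. The labels and cell contents are placeholders:

<table>
    <tr>
        <th>Left column</th>
        <th>Middle column</th>
        <th>Right column</th>
    </tr>
    <tr>
        <td>First row, left-most column</td>
        <!-- This cell spans the middle and right-most columns -->
        <td colspan="2">First row, spanning two columns</td>
    </tr>
</table>

The @rowspan attribute works the same way for cells that span multiple rows.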

Inline elements

The preceding elements are all block-level, which means that they don’t normally nest within one another. That is, paragraphs, lists, headings, and tables typically begin on new lines and end before the next block begins. There are also inline elements, which typically occur inside block-level elements. In our experience, the most useful inline elements are:

  1. <em> and <strong>: emphasis and strong emphasis, respectively. These are typically rendered in browsers as italic (for <em>) and bold (for <strong>). We prefer to use these semantic elements when the semantics are relevant, and to use CSS when the italics or bolding don’t represent emphasis or strong emphasis.

    HTML also has <i> and <b> elements, which represent italic and bold, respectively. These elements are presentational, rather than structural, and HTML markup should normally be used for structure, with presentation controlled by CSS.

  2. <a>: link. A string of text tagged as the element <a> serves as a clickable link, and the target is specified as an @href attribute. For example:

    <a href="http://dh.obdurodon.org">click here</a>

    will create a string of text that reads “click here” and that, when clicked, will take the user to the specified address.
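
Taken together, these inline elements might appear inside a paragraph as in the following sketch. The sentence is a placeholder, and the link target is the address used in the example above:

<p>Please read the instructions <em>carefully</em>, and <strong>do not skip</strong> the section on <a href="http://dh.obdurodon.org">our course site</a>.</p>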

Generic blocks and inline spans

There are times when you may need to demarcate a portion of your document solely to associate style with it using CSS. For those purposes HTML provides a generic block element (<div>) and a generic inline element (<span>). You can use <div>, for example, to contain several paragraphs to which you want to assign the same CSS, and you can use <span> to assign CSS to a few words in the middle of a paragraph that don’t naturally fit the other available inline elements.
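
As a sketch, the generic elements might be used as follows. The @class values “abstract” and “shelfmark” are hypothetical names made up for this example; they have no built-in meaning and have no effect until a CSS rule refers to them:

<div class="abstract">
    <p>First paragraph of the abstract.</p>
    <p>Second paragraph of the abstract.</p>
</div>
<p>The manuscript is catalogued as <span class="shelfmark">MS 123</span> in the library.</p>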

New sectioning elements added to HTML5

Before HTML5, navigation menus, tabbed panels, headers, footers, etc. were implemented by relying heavily on <div> and <span> elements with different @id and @class attributes, using CSS and JavaScript to control their styles and behaviors. This approach works up to a point, but it has at least two disadvantages (for more detail see The importance of HTML5 sectioning elements):

HTML5 introduced new sectioning elements in an attempt to standardize this aspect of tagging document structure. Some of the new sectioning elements are:

Element: Usage
<main>: Contains the main content of the body of a document. There can be only one <main> element per document.
<header>: Introduction of an article, a section, or the entire document.
<footer>: The footer of a site, long article, or section.
<article>: Independent content item (e.g., blog entry, article, forum post, etc.). According to the W3C specification, the contents of an HTML5 <article> element are, in principle, independently distributable or reusable, e.g., in syndication.
<section>: Generic section that may be used to group different articles depending on their purposes or subjects, or to divide a single article into sections. If you view the underlying HTML5 source of this page, you’ll see that we’ve used the <section> element in this latter function.
<nav>: Contains the main navigation links.
<aside>: Contains additional information that may not be directly related to the main content.
<figure>: Tags a figure (with optional caption, etc.) as a single item.
<figcaption>: Contains a caption for a figure.
<time>: Used to tag dates and times.

Note that these tags describe the function of the elements and not their appearance. Web browsers do not automatically render <header> elements at the top of the page or <footer> elements at the bottom or <aside> elements on the side, etc. In fact, these elements have no inherent rendering properties, and therefore no effect on the way the page is displayed in the browser. These new elements are about encoding structure, rather than appearance, and it is the responsibility of the developer to use CSS to control the rendering of the page components, as was also the case before HTML5.
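
To illustrate, the body of a page that uses several of these elements might be organized as in the following sketch. The arrangement, text, file names, and link targets are placeholders, and other arrangements are equally valid:

<body>
    <header>
        <h1>Site title</h1>
        <nav>
            <ul>
                <li><a href="index.xhtml">Home</a></li>
                <li><a href="about.xhtml">About</a></li>
            </ul>
        </nav>
    </header>
    <main>
        <article>
            <h2>Article title</h2>
            <p>Posted on <time datetime="2023-02-08">8 February 2023</time>.</p>
            <section>
                <h3>First section</h3>
                <p>Text of the first section.</p>
            </section>
            <figure>
                <!-- In XML syntax, empty elements such as img must be self-closed -->
                <img src="example.png" alt="Description of the image"/>
                <figcaption>Caption for the figure.</figcaption>
            </figure>
        </article>
        <aside>
            <p>Related information.</p>
        </aside>
    </main>
    <footer>
        <p>Contact information.</p>
    </footer>
</body>

Where each of these components appears on the screen would still have to be specified in CSS.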