The structure of a Web page

A template

Web pages are plain text files. Most have more-or-less the same structure. Here it is:

<!DOCTYPE HTML PUBLIC  "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>TITLE</title>
  </head>
  <body>
    BODY
  </body>
</html>

Figure 1. Standard page template

The page is made up of tags. Each tag is written between angle brackets, < and >. Most tags come in pairs, like <title></title>. The first part – <title> – opens the tag. The second part – </title> – closes it. You’ll see this pattern a lot. Open tag, close tag, open tag, close tag, open tag, close tag.
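Here is a small sketch of the pattern, using two lines taken from the template in Figure 1. The <!-- ... --> comments are notes for humans that the browser ignores; they are an addition for illustration and are not part of the template.

```html
<!-- A paired tag: an opening tag, some content, then a closing tag. -->
<title>TITLE</title>

<!-- One of the exceptions: in HTML 4.01, meta has no closing tag. -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
```

That is why the text says "most" tags come in pairs, not "all."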

Exercise: Upload the template

Upload the code in Figure 1 to your server. I’ll lead you through it this time.

Do the following:

  • Make a file on your computer, with the template code from Figure 1 in it.
  • Upload the file to your server.
  • Look at the result in your browser.

Making the file

Start an editor, like Notepad++. Download and install Notepad++ if you haven’t already. It’s free.

The Notepad++ Web site can be a little confusing. Here’s a direct link to the download page. Choose the installer program. As I’m writing this, the name of that file is npp.5.5.Installer.exe. Download that file, and run it. It will install Notepad++.

Remember, do not use Microsoft Word! Use a plain text editor. Word and other word processors add lots of cruft to files, and that will mess up your work.

Run Notepad++, or another editor you choose. Notepad++ will open with an empty file.

Copy the code in Figure 1. There’s an easy way to do this that doesn’t get the line numbers. Rest the mouse anywhere in the figure. A toolbar appears in the upper right corner of the figure:

Figure toolbar

The second button will copy the code in the figure to the clipboard, without any of the line numbers. W00f!

Copying to the clipboard

Paste the code into Notepad++. Save the file into a new directory on your computer. Call the file template.html, or some such. Remember to use only lowercase letters in the file name. This is because your Web server probably runs Unix, which is case-sensitive. It thinks that template.html and Template.html are different files.

Create a new directory for each exercise. For example, within your Documents directory, you might create a directory called coredogs, and within that a directory called clientcore (that’s the book you’re in), and within that a directory called web-page-with-text (that’s the lesson within the book) and within that a directory called upload-template (for this exercise).

Try to keep things organized. It’s a little more effort at the beginning, but it will save you from headaches later.

Upload your file

Start your FTP program, like WinSCP.

Connect to your Web server. We talked about this process earlier. Here’s a reminder.

When WinSCP starts, it will show you a Login dialog. Click New, and enter your connection information from the email you got from your Web hosting company. Like this:

Connecting to your server

You’ll see a split screen, like this:

FTP split screen

Remember that only files under your Web root will be accessible on the Web. On Hostgator, the Web root is usually called www or public_html. So that’s where your file should go.

Create a new directory on your server (under your Web root) for CoreDogs projects. You might call it coredogs. Within that, create separate directories for each project.

To create a directory, navigate to the parent. So if I wanted to create coredogs under www with WinSCP, I’d double-click on www, and create the directory there.

In WinSCP, press F7 to create a directory, or use the button at the bottom of the window:

Create directory button

You’ll see a dialog that lets you type the name of the directory:

Create directory dialog

Click OK, and the new directory will appear. W00f! Double-click on the directory to open it. You can create more directories under that one, if you want.

As with the files on your own computer, I recommend creating a separate directory for each book (e.g., clientcore), and within that a directory for each lesson (e.g., web-page-with-text), and within that a directory for each exercise (e.g., upload-template). That would give you a path like /www/coredogs/clientcore/web-page-with-text/upload-template. This seems like a lot of work, but it’s better than accidentally erasing things you need.

Time to upload the file. Find the file you created (maybe you called it template.html) in the left window. Remember that the left window is the file system on your computer.

WinSCP has a drop-down that gives you quick access to your Desktop, documents, drives, and such:

Local drives and such

Use it (and the directory tree below it) to navigate to where you stored the file you created (template.html or whatever you called it).

Local directory

Once there, upload the file with good old drag-and-drop, from the left to the right:

FTP split screen

You should see your file on the server. W00f!

Look at the file in your browser

Open up a browser. Type in the URL of the file you just uploaded. For example, if your site was drewid.nom, you might enter http://drewid.nom/coredogs/clientcore/web-page-with-text/upload-template/template.html. We talked earlier about the relationship between server files and URLs.

You should see something like this in your browser:

Template displayed

W00f! W00f! W00fy-w00f-w00f! With w00f sauce!

Later, if you forget the basics of creating and uploading a page, come back to this exercise.

Nesting

Tags are nested, that is, some tags are inside other tags. It’s important to get the nesting right. Inner tags should be closed before outer tags.

Figure 2. Tag nesting
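As a rough sketch of the rule, here are the <head> and <title> tags from the template, nested correctly and then incorrectly. The second fragment is deliberately invalid, and the comments are added for illustration only.

```html
<!-- Correct: title opens inside head, and closes before head closes. -->
<head>
  <title>TITLE</title>
</head>

<!-- Incorrect: head closes while title is still open. -->
<head>
  <title>TITLE
</head>
</title>
```

In the correct version, the inner tag (title) is completely wrapped by the outer tag (head).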

What happens if you violate nesting? Well, maybe nothing, maybe something. Different browsers handle invalid markup differently. Sometimes it will look fine in Internet Explorer, but not in Firefox. Or it might look OK on a Mac, but not a PC. It’s hard to tell without looking at all the combinations.

Webers strive for predictability. They want to create a page once, then have it work on every browser, on every operating system. That isn’t always feasible. But the more closely you follow the rules, the better off you’ll be.

Indenting

Webers use consistent indenting to make markup easier to read, and errors easier to spot.

Figure 3. Indenting

Browsers don’t care about indenting. Both pieces of code in Figure 3 would render identically in a browser. But the first one is easier to follow.
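To make the comparison concrete, here is a sketch of the template from Figure 1 with all of the indenting stripped out. This is an illustration of the idea, and may not match the exact code shown in Figure 3.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>TITLE</title>
</head>
<body>
BODY
</body>
</html>
```

A browser treats this the same as the indented version. For a human, though, it is harder to see at a glance that the title sits inside the head, or to spot a missing closing tag.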

Renata

If it looks the same to the user, why would it matter? In the end, it’s the user experience that’s key.

CC

Can I answer that one?

Kieran

Sure.

CC

I’ve worked in a few different businesses. The one thing you can count on is change.

This page might be fine today. But tomorrow, a marketing type is going to want to change the text. And next week, it will be different again.

The easier the markup is to read, the easier it will be to change. And the fewer mistakes you’ll make.

CC is right. You need to learn to think like a Weber. Webers are thinking not only about the results (that is, the page), but about the work processes they use to create the results. They try to make the work processes fast and accurate. The indenting in Figure 3 would help.

We’ll see this again and again.

DOCTYPE

Let’s look at the tags in Figure 1. Here it is again.

<!DOCTYPE HTML PUBLIC  "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>TITLE</title>
  </head>
  <body>
    BODY
  </body>
</html>

Figure 1 (again). Standard page template

The first line tells the browser what HTML standard we are using. The World Wide Web Consortium (W3C) creates standards for various things, including HTML. The current standard is HTML 4.01, though HTML 5 is waiting in the wings.

Line 1 says we’ll use the 4.01 standard, and we’ll be complying with it strictly. “strict” will give us the most predictable results across browsers.

The <head> section

All of the HTML is between the <html> tags on lines 2 and 10. Notice that it’s a matched pair, as are most tags.

The page has two sections: the <head> section (lines 3 – 6), and the <body> section (lines 7 – 9). The <head> section contains metadata, that is, data describing the page.

Character set

Line 4 tells the browser what character set the page will use. Long ago, computers could only store upper- and lowercase letters (A-Z and a-z), digits (0-9), and a few symbols (&, @, !, space, etc.). This was called the ASCII character set. Fine for most people in the United States, but not for everybody else.

After a while, characters like é and ß started appearing on computer screens. This was the ISO-8859-1 character set, or something similar. An improvement, but still not great. What about Cyrillic and Chinese characters? Huh? Huh?

Today, there are character sets that include thousands of glyphs (a glyph is a physical image for a character). UTF-8 is a common one.

The two most common character sets on Web pages are ISO-8859-1 and UTF-8. The former works for Western languages, and many Webers use it. UTF-8 is slowly taking over, however. Everything that is in ISO-8859-1 is in UTF-8. We’ll use UTF-8.

The character set definition is typical of the tags in the head section. It tells the browser about the page, but doesn’t tell the browser what to show on the page. Line 4 tells the browser that the page content could contain some strange characters. But it doesn’t tell the browser what characters it will be required to show.
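Here is a minimal sketch of a page you could use to try this out. The title and the sample text are made up for illustration, and the <p> tag (which marks a paragraph) is an assumption here, since body tags haven't been covered yet.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <!-- Tells the browser how the characters in this file are encoded. -->
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>Character set test</title>
  </head>
  <body>
    <!-- The p tag is an assumption for this sketch; sample text is invented. -->
    <p>Café, Straße, Привет, 你好</p>
  </body>
</html>
```

For this to work, the file itself has to be saved as UTF-8 in your editor. If the file is saved in one character set but the meta tag names another, the non-English characters will probably show up garbled.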

The <title>

Line 5 is the <title> tag. It doesn’t affect the main area of the page itself, but it does show up. Where? In line 5, the text for the title is TITLE. Have a look at the page template in Figure 1 in your browser. Where do you see TITLE?

Find it yet? I’ll wait.

Do do do do-do do do doo, do do do do doo, do-do-do-do-do do do do do-do do do doo, do, do-do do do, do, do, dum dum.

Yes, it’s in the top area of the browser’s window.

Figure 4. Page title

This tells the user what the page is about. The content comes later.

The title appears in a couple of other places as well. First, if the user bookmarks the page, the title will show up in the bookmark list.

Second, the title will appear in search engine listings for the page.

You may have seen search engines show “Untitled” for a page. This means that a Weber forgot a <title> tag.

Search engines also use the <title> tag to figure out what the page is about. If your title is “Dogs of Doom,” and a Googler searches for “doom dogs,” there’s a good chance Google will show your page. All because of the title.

The <title> tag is probably the most important tag in SEO, or search engine optimization. This is the art of getting your pages to rank high in search engine results.
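As a sketch, here is the template from Figure 1 with the placeholder TITLE swapped for the "Dogs of Doom" title from the example above. Only the title line has changed; the comment is added for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <!-- Shows up in the browser's title area, in bookmarks, and in search listings. -->
    <title>Dogs of Doom</title>
  </head>
  <body>
    BODY
  </body>
</html>
```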

A broken page

While researching this lesson, I came across some interesting code. I saw the following in Google:

Figure 5. Google listing

Strange title for a page! No way to know what the page would be about, just from the title.

I looked at the code of the page, and saw this:

Figure 6. Broken title

The page starts off OK, except for the missing DOCTYPE tag. It opens a title tag at (1). But then there’s a DOCTYPE! Huh? A complete page is embedded in the title tag of a page!

The real title is at (2), but the browser can’t pick it up. The code is so messed up that the browser can’t figure it out. And neither could the Google search engine.

This shows you the importance of well-formed markup, that is, markup that follows the rules. The code in Figure 6 is not well-formed. It confuses browsers. And it won’t show up right in searches, so people are less likely to find the page.
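To give a feel for the kind of mistake described, here is an invented sketch. It is not the real page's code, and "The real title" is a made-up placeholder. The first <title> corresponds to marker (1), and the second <title> corresponds to marker (2).

```html
<!-- Invented illustration of the problem; not the actual page's markup. -->
<html>
  <head>
    <title>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>The real title</title>
  </head>
  <body>
    BODY
  </body>
</html>
```

The outer page never has a DOCTYPE of its own, and its <title> swallows the start of a whole second page. Compare this with Figure 1, where every tag is opened once and closed in the right order.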

The body

Here’s Figure 1 again.

<!DOCTYPE HTML PUBLIC  "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>TITLE</title>
  </head>
  <body>
    BODY
  </body>
</html>

Figure 1 (again). Standard page template

The actual content of the page is in the <body> section, from lines 7 to 9. This is where the real action happens.

There isn’t anything there at the moment. Let’s add some stuff in the next lesson.

Exercise: Save the template

It helps to have a copy of the bare-bones template lying around. When you create a new page, you start off with the markup in the template.

Copy the code in Figure 1 into your own file. Save it as template.html on your local computer, and upload it to your hosting account as well. Put it somewhere where you can easily find it, like in the Web root.

Enter the URL of the file below.

Summary

So far, you’ve:

  • Learned about the structure of a Web page.
  • Learned about character sets.
  • Learned that Webers keep work processes in mind.
  • Learned why it’s important to get the title right.

What now?

Let’s start adding some HTML tags to the body.

