CSE103

Introduction to Computers and Information Technologies, Fall 2001
W 10:30 - 11:25 Earth & Space 081

Herbert J. Bernstein

This page is: http://www.cs.sunysb.edu/~cse103/Fall_01/CSE103_Protocols.html
Various copyrights apply. All rights reserved.

Internet Protocols

What are protocols

In order for two autonomous devices on a network to communicate, they need to adhere to rules which allow them to cooperate in the process of exchanging data. We call the rules that such devices follow a "protocol". We will look at:

Telnet -- a protocol for terminal emulation
FTP -- a protocol for the transfer of files
SSH and SCP -- secure terminal emulation and file transfer protocols
email -- various protocols for the transfer of electronic mail
HTTP -- the Hypertext Transfer Protocol used for web pages

In addition, some protocols require the use of a particular language for the information being sent. For the internet the most important languages are:

HTML -- Hypertext Markup Language
XML -- Extensible Markup Language
XHTML -- Extensible Hypertext Markup Language

Most internet protocols are published as Internet Engineering Task Force (IETF http://www.ietf.org/) documents called "Request for Comments" (RFC http://www.ietf.org/rfc.html).

Telnet

The Telnet protocol is specified by J. Postel, J. Reynolds, RFC 854, "TELNET PROTOCOL SPECIFICATION", May 1983 (http://www.ietf.org/rfc/rfc0854.txt), from which we quote:

"The purpose of the TELNET Protocol is to provide a fairly general, bi-directional, eight-bit byte oriented communications facility. Its primary goal is to allow a standard method of interfacing terminal devices and terminal-oriented processes to each other. It is envisioned that the protocol may also be used for terminal-terminal communication ("linking") and process-process communication (distributed computation)."

The Telnet protocol allows a person typing on a keyboard attached to one (local) computer to act as if typing on a keyboard attached to another (remote) computer, and to see on the local computer the characters that the remote computer sends in reply.

The basic concepts of moving sets of characters from one computer to another as if they were directly connected by a wire are at the heart of most of the other protocols.
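As a rough illustration of that idea, the following Python sketch connects two endpoints inside a single program and passes characters between them in both directions. It is not the Telnet protocol itself (there is no option negotiation), and the function name round_trip and the reply text are our own inventions for this example.

```python
import socket


def round_trip(typed: bytes) -> bytes:
    """Send bytes from a 'local' endpoint and return the 'remote' reply."""
    # socketpair() gives two already-connected endpoints, standing in
    # for the two ends of a wire between a local and a remote computer.
    local, remote = socket.socketpair()
    with local, remote:
        local.sendall(typed)              # characters typed locally
        received = remote.recv(1024)      # ...arrive at the remote end
        remote.sendall(b"remote saw: " + received)
        # Assumption: messages are short enough to arrive in one recv();
        # real code would loop until a complete line has been read.
        return local.recv(1024)


print(round_trip(b"date\r\n"))
```

A real Telnet client does the same thing across a network, with extra control sequences mixed into the byte stream.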

FTP

The FTP protocol is specified by J. Postel, J. Reynolds, RFC 959, "FILE TRANSFER PROTOCOL (FTP)", October 1985 (http://www.ietf.org/rfc/rfc0959.txt), from which we quote:

"The objectives of FTP are 1) to promote sharing of files (computer programs and/or data), 2) to encourage indirect or implicit (via programs) use of remote computers, 3) to shield a user from variations in file storage systems among hosts, and 4) to transfer data reliably and efficiently. FTP, though usable directly by a user at a terminal, is designed mainly for use by programs."

For many years, FTP was, as intended, the primary file-sharing protocol on the internet. Many sites posted interesting files in anonymous FTP accounts to which any user on the internet might connect. This protocol is still heavily used, often from within web pages.
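To make the anonymous FTP idea concrete, here is a small Python sketch of the text commands a client might send on the FTP control connection to fetch one file. The command names (USER, PASS, TYPE, PASV, RETR, QUIT) come from RFC 959; the function name, the file path and the e-mail address used as a password are assumptions chosen only for illustration, and the sketch sends nothing over a network.

```python
from typing import List


def anonymous_fetch_commands(path: str,
                             email: str = "guest@example.org") -> List[bytes]:
    """Return the control-connection lines for one anonymous download."""
    commands = [
        "USER anonymous",   # log in to the anonymous account
        f"PASS {email}",    # by convention, an e-mail address as password
        "TYPE I",           # image (binary) transfer type
        "PASV",             # ask the server to listen for the data connection
        f"RETR {path}",     # retrieve the named file
        "QUIT",             # end the session
    ]
    # Each command is a line of text terminated by carriage return, line feed.
    return [(c + "\r\n").encode("ascii") for c in commands]


for line in anonymous_fetch_commands("pub/README"):
    print(line)
```

The server answers each command with a numbered reply, and the file itself travels over a separate data connection that is not shown here.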

SSH and SCP

SSH and SCP are encrypted terminal emulation and file transfer protocols. They differ in technical details from Telnet and FTP because they are descended from slightly different protocols: rlogin (B. Kantor, RFC 1282, "BSD Rlogin", December 1991 (http://www.ietf.org/rfc/rfc1282.txt)), which provides a Unix-style remote login capability, and rcp, a remote copy protocol built on top of RPC (Sun Microsystems, RFC 1057, "RPC: Remote Procedure Call Protocol Specification Version 2", June 1988 (http://www.ietf.org/rfc/rfc1057.txt)), which provides a Unix-style remote file copy capability.

Due to complex and rapidly evolving interactions between hacker attacks on the internet and legal constraints on the use of encryption, the documentation of internet encryption protocols and the protocols themselves are in flux. One specification can be found in T. Ylonen, T. Kivinen, M. Saarinen, T. Rinne, S. Lehtinen, "SSH Protocol Architecture", draft-ietf-secsh-architecture-09.txt, Network Working Group Internet-Draft, July 20, 2001, http://www.ietf.org/internet-drafts/draft-ietf-secsh-architecture-09.txt

On many Unix systems, the command to initiate an ssh terminal session is

ssh host -l username

and the command to copy a file from one system to another is

scp usersrc@hostsrc:filesrc userdst@hostdst:filedst

On many systems, GUI interfaces are provided.
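For readers who want to script these commands, the Python sketch below only assembles the command lines; it does not run them. The helper names ssh_command and scp_command and the host and user names are hypothetical, and we assume the ssh and scp programs are installed and accept the argument forms shown (the -l option is placed before the host here, a common ordering).

```python
import shlex
from typing import List


def ssh_command(host: str, username: str) -> List[str]:
    """Argument list for an interactive ssh terminal session."""
    return ["ssh", "-l", username, host]


def scp_command(source: str, destination: str) -> List[str]:
    """Argument list for copying one file with scp.

    Each argument is either a local path or user@host:path.
    """
    return ["scp", source, destination]


print(shlex.join(ssh_command("hostdst", "userdst")))
print(shlex.join(scp_command("notes.txt", "userdst@hostdst:notes.txt")))
```

Passing either list to subprocess.run would start the corresponding program, which then prompts for a password or uses a key as configured on the system.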

email Protocols

Electronic mail consists of files of information to be routed to particular users on particular computers. Some of the information serves as a virtual envelope, providing address information about the origin and the destination. The rest of the information is the data to be sent. In the early days, when computers were up and on networks only intermittently, mail had to be sent in a manner similar to a combination of stagecoaches and the old Pony Express: in delayed stages, handed off from machine to machine and held as long as necessary on intermediate machines until an onward connection was available. The protocol used for that stage-by-stage transfer was called UUCP.

In modern times, messages can be sent quickly, and often without the need for intermediate storage. The major protocols are SMTP, POP, IMAP and MIME. SMTP provides the mechanism to send and receive mail when there is full access to the internet. POP and IMAP are used by machines with intermittent access, such as desktop PCs which may be turned off at night. MIME is an encoding used to carry images and other binary information reliably within email messages.

UUCP: The UUCP protocol is specified by Mark J. Horton, RFC 976, "UUCP Mail Interchange Format Standard", February 1986 (http://www.ietf.org/rfc/rfc0976.txt). The actual format of email messages is defined by David H. Crocker, RFC 822, "STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES", August 13, 1982 (http://www.ietf.org/rfc/rfc0822.txt). Actual stage by stage mail transfers could be by any of many protocols.
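The message format can be seen by building a small message with Python's standard email package, as in the sketch below. The addresses, subject and body text are made up for the example, and the library also adds a few MIME header fields that RFC 822 itself does not define.

```python
from email.message import EmailMessage


def build_message(sender: str, recipient: str,
                  subject: str, body: str) -> EmailMessage:
    """Build a simple text message: header fields, then the body."""
    msg = EmailMessage()
    msg["From"] = sender        # address information about the origin
    msg["To"] = recipient       # address information about the destination
    msg["Subject"] = subject
    msg.set_content(body)       # the data to be sent
    return msg


msg = build_message("alice@example.org", "bob@example.com",
                    "Lunch", "Noon at the usual place?\n")
# In the text form, a blank line separates the header fields from the body.
headers, _, body = msg.as_string().partition("\n\n")
print(headers)
print("----")
print(body)
```

Each header field is a name, a colon and a value on its own line, which is the layout RFC 822 describes.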

SMTP: The SMTP protocol is specified by Jonathan B. Postel, RFC 821, "SIMPLE MAIL TRANSFER PROTOCOL", August 1982 (http://www.ietf.org/rfc/rfc0821.txt), from which we quote:

"The SMTP design is based on the following model of communication: as the result of a user mail request, the sender-SMTP establishes a two-way transmission channel to a receiver-SMTP. The receiver-SMTP may be either the ultimate destination or an intermediate. SMTP commands are generated by the sender-SMTP and sent to the receiver-SMTP. SMTP replies are sent from the receiver-SMTP to the sender-SMTP in response to the commands.

"Once the transmission channel is established, the SMTP-sender sends a MAIL command indicating the sender of the mail. If the SMTP-receiver can accept mail it responds with an OK reply. The SMTP-sender then sends a RCPT command identifying a recipient of the mail. If the SMTP-receiver can accept mail for that recipient it responds with an OK reply; if not, it responds with a reply rejecting that recipient (but not the whole mail transaction). The SMTP-sender and SMTP-receiver may negotiate several recipients. When the recipients have been negotiated the SMTP-sender sends the mail data, terminating with a special sequence. If the SMTP-receiver successfully processes the mail data it responds with an OK reply. The dialog is purposely lock-step, one-at-a-time."

POP: The POP protocol is specified by J. Myers, M. Rose, RFC 1939, "Post Office Protocol - Version 3", May 1996 (http://www.ietf.org/rfc/rfc1939.txt), from which we quote:

"On certain types of smaller nodes in the Internet it is often impractical to maintain a message transport system (MTS). For example, a workstation may not have sufficient resources (cycles, disk space) in order to permit a SMTP server [RFC821] and associated local mail delivery system to be kept resident and continuously running. Similarly, it may be expensive (or impossible) to keep a personal computer interconnected to an IP-style network for long amounts of time (the node is lacking the resource known as "connectivity").

"Despite this, it is often very useful to be able to manage mail on these smaller nodes, and they often support a user agent (UA) to aid the tasks of mail handling. To solve this problem, a node which can support an MTS entity offers a maildrop service to these less endowed nodes. The Post Office Protocol - Version 3 (POP3) is intended to permit a workstation to dynamically access a maildrop on a server host in a useful fashion. Usually, this means that the POP3 protocol is used to allow a workstation to retrieve mail that the server is holding for it.

"POP3 is not intended to provide extensive manipulation operations of mail on the server; normally, mail is downloaded and then deleted. A more advanced (and complex) protocol, IMAP4, is discussed in [RFC1730]."

IMAP: The IMAP protocol is specified by M. Crispin, RFC 1730, "INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4", December 1994 (http://www.ietf.org/rfc/rfc1730.txt), from which we quote:

"The Internet Message Access Protocol, Version 4 (IMAP4) allows a client to access and manipulate electronic mail messages on a server. IMAP4 permits manipulation of remote message folders, called "mailboxes", in a way that is functionally equivalent to local mailboxes. IMAP4 also provides the capability for an offline client to resynchronize with the server (see also [IMAP-DISC]).

"IMAP4 includes operations for creating, deleting, and renaming mailboxes; checking for new messages; permanently removing messages; setting and clearing flags; RFC 822 and MIME parsing; searching; and selective fetching of message attributes, texts, and portions thereof."

MIME: The Multipurpose Internet Mail Extensions format is used to allow email messages to carry images, word processing documents, executable programs, and other binary data. There are several specification documents:

N. Freed, N. Borenstein, RFC 2045, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", November 1996 (http://www.ietf.org/rfc/rfc2045.txt),

N. Freed, N. Borenstein, RFC 2046, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", November 1996 (http://www.ietf.org/rfc/rfc2046.txt),

K. Moore, RFC 2047, "Multipurpose Internet Mail Extensions (MIME) Part Three: Message Header Extensions for Non-ASCII Text", November 1996 (http://www.ietf.org/rfc/rfc2047.txt),

N. Freed, J. Klensin, J. Postel, RFC 2048, "Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures", November 1996 (http://www.ietf.org/rfc/rfc2048.txt),

N. Freed, N. Borenstein, RFC 2049, "Multipurpose Internet Mail Extensions (MIME) Part Five: Conformance Criteria and Examples", November 1996 (http://www.ietf.org/rfc/rfc2049.txt).

We quote from the last of these documents:

"STD 11, RFC 822, defines a message representation protocol specifying considerable detail about US-ASCII message headers, and leaves the message content, or message body, as flat US-ASCII text. This set of documents, collectively called the Multipurpose Internet Mail Extensions, or MIME, redefines the format of messages to allow for

"(1) textual message bodies in character sets other than US-ASCII,

"(2) an extensible set of different formats for non-textual message bodies,

"(3) multi-part message bodies, and

"(4) textual header information in character sets other than US-ASCII.

"These documents are based on earlier work documented in RFC 934, STD 11, and RFC 1049, but extends and revises them. Because RFC 822 said so little about message bodies, these documents are largely orthogonal to (rather than a revision of) RFC 822.

"The initial document in this set, RFC 2045, specifies the various headers used to describe the structure of MIME messages. The second document defines the general structure of the MIME media typing system and defines an initial set of media types. The third document, RFC 2047, describes extensions to RFC 822 to allow non-US- ASCII text data in Internet mail header fields. The fourth document, RFC 2048, specifies various IANA registration procedures for MIME- related facilities. This fifth and final document describes MIME conformance criteria as well as providing some illustrative examples of MIME message formats, acknowledgements, and the bibliography."

MIME is important not only for email but also for the Hypertext Transfer Protocol, since it is used to allow binary information to be transferred in that context as well.
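A short Python sketch shows what MIME does to a message that carries binary data. It uses the standard email package; the function name, the file name data.bin and the payload of 256 arbitrary byte values are assumptions made for the example.

```python
from email.message import EmailMessage


def message_with_attachment(body: str, payload: bytes,
                            filename: str) -> EmailMessage:
    """Build a multi-part message: a text body plus one binary attachment."""
    msg = EmailMessage()
    msg["Subject"] = "File attached"
    msg.set_content(body)
    # Adding an attachment turns the message into a multi-part message.
    msg.add_attachment(payload, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg


msg = message_with_attachment("See the attached file.\n",
                              bytes(range(256)), "data.bin")
print(msg.get_content_type())
for part in msg.iter_parts():
    print(part.get_content_type(), part["Content-Transfer-Encoding"])
```

The binary part is carried as base64 text, so the message as a whole still consists of ordinary printable characters, and the receiving program decodes it back into the original bytes.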

HTTP

The HTTP protocol is specified by R. Fielding, J. Gettys, J. Mogul, H. Frystyk, T. Berners-Lee, RFC 2068, "Hypertext Transfer Protocol -- HTTP/1.1", January 1997 (http://www.ietf.org/rfc/rfc2068.txt), from which we quote:

"The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods. A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred.

"HTTP has been in use by the World-Wide Web global information initiative since 1990. This specification defines the protocol referred to as "HTTP/1.1"."

The language used to write documents for transfer via the HTTP protocol is called HTML, which we discuss elsewhere in these notes. HTML is being combined with a general document markup language called XML (Extensible Markup Language) to form XHTML. For more on these languages, see the web site of the World Wide Web Consortium at http://www.w3.org.
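One practical consequence of basing XHTML on XML is that every element must be explicitly closed. The Python sketch below uses the standard XML parser to check two small fragments; the function name is_well_formed_xml and the sample markup are made up for the example, and the check covers only XML well-formedness, not the full XHTML rules.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# XHTML style: the empty line-break element is closed with "/>".
print(is_well_formed_xml("<p>Line one<br/>line two</p>"))
# Older HTML style: <br> is never closed, so an XML parser rejects it.
print(is_well_formed_xml("<p>Line one<br>line two</p>"))
```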

Prepared by Herbert J. Bernstein yaya@cs.sunysb.edu, 2 October 2001.