- <?xml version="1.0" encoding="UTF-8"?>
- <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
- "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
- <section id="sip_intro" xmlns:xi="http://www.w3.org/2001/XInclude">
- <sectioninfo>
- <authorgroup>
- <author>
- <firstname>Jan</firstname>
- <surname>Janak</surname>
- <email>[email protected]</email>
- </author>
- </authorgroup>
- <copyright>
- <year>2003</year>
- <holder>FhG FOKUS</holder>
- </copyright>
- <abstract>
- <para>
- A brief overview of SIP describing all important aspects of the Session Initiation
- Protocol.
- </para>
- </abstract>
- </sectioninfo>
- <title>SIP Introduction</title>
- <section id="purpose">
- <title>Purpose of SIP</title>
- <simpara>
- SIP stands for Session Initiation Protocol. It is an application-layer control
- protocol which has been developed and designed within the IETF. The protocol has
- been designed with easy implementation, good scalability, and flexibility in mind.
- </simpara>
- <simpara>
- The specification is available in the form of several <abbrev>RFCs</abbrev>; the most
- important one is RFC3261, which contains the core protocol specification. The
- protocol is used for creating, modifying, and terminating sessions with one or more
- participants. By a session we understand a set of senders and receivers that
- communicate, together with the state kept in those senders and receivers during the
- communication. Examples of sessions include Internet telephone calls,
- distribution of multimedia, multimedia conferences, and distributed computer games.
- </simpara>
- <simpara>
- SIP is not the only protocol that the communicating devices will need. It is not
- meant to be a general-purpose protocol. The purpose of SIP is just to make the
- communication possible; the communication itself must be achieved by other means
- (and possibly another protocol). The two protocols most often used along with
- SIP are RTP and SDP. RTP carries the real-time multimedia
- data (including audio, video, and text); it makes it possible to encode
- the data, split it into packets, and transport those packets over the
- Internet. SDP is used to describe and encode the
- capabilities of session participants. Such a description is then used to negotiate
- the characteristics of the session so that all the devices can participate. This
- includes, for example, negotiating the codecs used to encode media, so that all
- participants are able to decode it, and negotiating the transport protocol to be
- used.
- </simpara>
- <simpara>
- SIP has been designed in conformance with the Internet model. It is an end-to-end
- oriented signaling protocol, which means that all the logic is stored in end
- devices (except routing of SIP messages). State is also stored in end devices
- only, so there is no single point of failure, and networks designed this way scale
- well. The price that we have to pay for this distribution and scalability is
- higher message overhead, caused by the messages being sent end-to-end.
- </simpara>
- <simpara>
- It is worth mentioning that the end-to-end concept of SIP is a significant
- departure from the regular PSTN (Public Switched Telephone Network), where all the
- state and logic are stored in the network and end devices (telephones) are very
- primitive. The aim of SIP is to provide the same functionality that traditional
- PSTNs have, but the end-to-end design makes SIP networks much more powerful and
- open to new services that can hardly be implemented in
- traditional PSTNs.
- </simpara>
- <simpara>
- SIP is based on HTTP, which is probably the most successful and widely used
- protocol on the Internet. HTTP in turn inherited the format of its message
- headers from RFC822, and SIP tries to combine the best of both. In fact, HTTP
- can be classified as a signaling protocol too, because user agents use it
- to tell an HTTP server which documents they are interested in. SIP is used to
- carry the description of session parameters; the description is encoded into a
- document using SDP. Both protocols (HTTP and SIP) use the RFC822 encoding of
- message headers, which has proven to be robust and flexible
- over the years.
- </simpara>
- </section>
- <section id="sip_uri">
- <title>SIP URI</title>
- <simpara>
- SIP entities are identified using SIP URIs (Uniform Resource Identifiers). A
- SIP URI has the form sip:username@domain, for instance,
- sip:[email protected]. As we can see, a SIP URI consists of a username part and
- a domain name part delimited by the @ (at) character. SIP URIs are similar to
- e-mail addresses; it is, for instance, possible to use the same address for e-mail
- and SIP communication, and such URIs are easy to remember.
- </simpara>
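- <simpara>
- The structure of a SIP URI can be illustrated with a minimal Python sketch. It
- only handles the simple sip:username@domain form with an optional port; URI
- parameters are ignored, and passwords and IPv6 hosts are not handled. The names
- SipUri and parse_sip_uri and the example addresses are assumptions made for this
- illustration only.
- </simpara>
- <programlisting>
- <![CDATA[
```python
from typing import NamedTuple, Optional


class SipUri(NamedTuple):
    user: str
    host: str
    port: Optional[int]  # None when the URI carries no explicit port


def parse_sip_uri(uri: str) -> SipUri:
    """Split a simple sip:username@domain[:port] URI into its parts."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "sip":
        raise ValueError(f"not a SIP URI: {uri!r}")
    # Ignore URI parameters (";transport=udp") and headers ("?subject=x").
    rest = rest.split(";", 1)[0].split("?", 1)[0]
    user, at, hostport = rest.partition("@")
    if not at or not user or not hostport:
        raise ValueError(f"expected username@domain in {uri!r}")
    host, colon, port = hostport.partition(":")
    return SipUri(user, host, int(port) if colon else None)


# Hypothetical addresses, used only for illustration.
print(parse_sip_uri("sip:bob@b.example"))
print(parse_sip_uri("sip:bob@192.0.2.4:5060;transport=udp"))
```
- ]]>
- </programlisting>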
- </section>
- <section id="sip_network_elements">
- <title>SIP Network Elements</title>
- <simpara>
- Although in the simplest configuration it is possible to use just two user agents
- that send SIP messages directly to each other, a typical SIP network will
- contain more than one type of SIP element. The basic SIP elements are user agents,
- proxies, registrars, and redirect servers. We will briefly describe them in this
- section.
- </simpara>
- <simpara>
- Note that the elements, as presented in this section, are often only logical
- entities. It is often advantageous to co-locate them, for instance to
- increase the speed of processing, but that depends on the particular implementation
- and configuration.
- </simpara>
- <section id="user_agents">
- <title>User Agents</title>
- <simpara>
- Internet end points that use SIP to find each other and to negotiate session
- characteristics are called <emphasis>user agents</emphasis>. User agents
- usually, but not necessarily, reside on a user's computer in the form of an
- application. This is currently the most widely used approach, but user agents
- can also be cellular phones, PSTN gateways, <acronym>PDAs</acronym>, automated
- <acronym>IVR</acronym> systems and so on.
- </simpara>
- <simpara>
- User agents are often referred to as <emphasis>User Agent Server</emphasis>
- (UAS) and <emphasis>User Agent Client</emphasis> (UAC). UAS and UAC are
- logical entities only; each user agent contains both a UAC and a UAS. The UAC is the
- part of the user agent that sends requests and receives responses. The UAS is the
- part of the user agent that receives requests and sends responses.
- </simpara>
- <simpara>
- Because a user agent contains both a UAC and a UAS, we often say that a user
- agent behaves like a UAC or a UAS. For instance, the caller's user agent behaves
- like a UAC when it sends an INVITE request and receives responses to the
- request. The callee's user agent behaves like a UAS when it receives the INVITE
- and sends responses.
- </simpara>
- <simpara>
- But this situation changes when the callee decides to send a BYE and terminate
- the session. In this case the callee's user agent (sending the BYE) behaves like
- a UAC and the caller's user agent behaves like a UAS.
- </simpara>
- <figure id="uac_and_uas">
- <title>UAC and UAS</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/ua.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Picture showing UAC and UAS</phrase>
- </textobject>
- </mediaobject>
- </figure>
- <simpara>
- <xref linkend="uac_and_uas"/> shows three user agents and one stateful forking
- proxy. Each user agent contains a UAC and a UAS. The part of the proxy that
- receives the INVITE from the caller in fact acts as a UAS. When forwarding the
- request statefully, the proxy creates two UACs, each of them responsible for
- one branch.
- </simpara>
- <simpara>
- In our example callee B picked up the call, and later, when B wants to tear down
- the call, B's user agent sends a BYE. At this time the user agent that was previously a UAS becomes a
- UAC and vice versa.
- </simpara>
- </section>
- <section id="proxy_servers">
- <title>Proxy Servers</title>
- <simpara>
- In addition to user agents, SIP allows the creation of an infrastructure of network hosts
- called <emphasis>proxy servers</emphasis>. User agents can send messages to a
- proxy server. Proxy servers are very important entities in the SIP
- infrastructure. They perform routing of session invitations according to the
- invitee's current location, authentication, accounting and many other important
- functions.
- </simpara>
- <simpara>
- The most important task of a proxy server is to route session invitations
- "closer" to the callee. A session invitation will usually traverse a
- set of proxies until it finds one that knows the actual location of the
- callee. That proxy forwards the session invitation directly to the callee,
- and the callee then accepts or declines it.
- </simpara>
- <simpara>
- There are two basic types of SIP proxy servers--stateless and stateful.
- </simpara>
- <section id="stateless_servers">
- <title>Stateless Servers</title>
- <simpara>
- Stateless servers are simple message forwarders. They forward messages
- independently of each other. Although messages are usually arranged into
- transactions (see <xref linkend="sip_transactions"/>), stateless proxies
- are not aware of transactions.
- </simpara>
- <simpara>
- Stateless proxies are simpler and faster than stateful proxy servers. They
- can be used as simple load balancers, message translators and routers. One
- of the drawbacks of stateless proxies is that they are unable to absorb
- retransmissions of messages or to perform more advanced routing, for instance,
- forking or recursive traversal.
- </simpara>
- </section>
- <section id="stateful_servers">
- <title>Stateful Servers</title>
- <simpara>
- Stateful proxies are more complex. Upon reception of a request, a stateful
- proxy creates state and keeps it until the transaction
- finishes. Some transactions, especially those created by INVITE, can last
- quite long (until the callee picks up or declines the call). Because stateful
- proxies must maintain the state for the duration of the transactions, their
- performance is limited.
- </simpara>
- <simpara>
- The ability to associate SIP messages with transactions gives stateful
- proxies some interesting features. Stateful proxies can perform forking,
- which means that upon reception of a message two or more messages are sent
- out.
- </simpara>
- <simpara>
- Stateful proxies can absorb retransmissions because they know, from the
- transaction state, whether they have already received the same message (stateless
- proxies cannot do the check because they keep no state).
- </simpara>
- <simpara>
- Stateful proxies can use more complicated methods of finding a user. It
- is, for instance, possible to try to reach the user's office phone first and, if
- the user doesn't pick up, to redirect the call to the user's cell phone. Stateless
- proxies can't do this because they have no way of knowing how the
- transaction targeted at the office phone finished.
- </simpara>
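- <simpara>
- The office-phone-then-cell-phone search can be sketched as follows. This is an
- illustration under simplifying assumptions, not the algorithm of any particular
- proxy: the function find_user, the send_invite callback and the example
- addresses are hypothetical, and the callback simply returns the final status
- code of the transaction it started. It is exactly this final status that a
- stateless proxy never learns.
- </simpara>
- <programlisting>
- <![CDATA[
```python
from typing import Callable, Iterable, Optional


def find_user(targets: Iterable[str],
              send_invite: Callable[[str], int]) -> Optional[str]:
    """Try the targets one after another; return the one that accepted.

    send_invite is a hypothetical callback: it forwards the INVITE to one
    target and returns the final status code of that transaction.
    """
    for target in targets:
        status = send_invite(target)
        if 200 <= status <= 299:
            return target  # accepted, no need to try further targets
        if status >= 600:
            return None    # 6xx: the request cannot be fulfilled anywhere
        # Any other final response: fall through to the next target.
    return None


# Simulated outcomes: the office phone times out, the cell phone answers.
outcomes = {
    "sip:bob@office.b.example": 408,
    "sip:bob@cell.b.example": 200,
}
print(find_user(outcomes, outcomes.__getitem__))
```
- ]]>
- </programlisting>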
- <simpara>
- Most SIP proxies today are stateful because their configuration is usually
- very complex. They often perform accounting, forking, or some sort of NAT
- traversal aid, and all those features require a stateful proxy.
- </simpara>
- </section>
- <section id="proxy_server_usage">
- <title>Proxy Server Usage</title>
- <simpara>
- A typical configuration is that each centrally administered entity (a
- company, for instance) has its own SIP proxy server, which is used by all
- user agents in the entity. Let's suppose that there are two companies, A and
- B, and each of them has its own proxy server. <xref linkend="companies"/>
- shows how a session invitation from employee Joe in company A reaches
- employee Bob in company B.
- </simpara>
- <figure id="companies">
- <title>Session Invitation</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/companies.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Picture showing a session invitation message flow</phrase>
- </textobject>
- </mediaobject>
- </figure>
- <simpara>
- User Joe uses address sip:[email protected] to call Bob. Joe's user agent doesn't
- know how to route the invitation itself, but it is configured to send all
- outbound traffic to the company SIP proxy server proxy.a.com. The proxy
- server figures out that user sip:[email protected] is in a different company, so it
- looks up B's SIP proxy server and sends the invitation there. B's proxy
- server can be either pre-configured at proxy.a.com, or the proxy can use
- <acronym>DNS SRV</acronym> records to find it. The invitation
- reaches proxy.b.com. That proxy knows that Bob is currently sitting in his
- office and is reachable through the phone on his desk, which has IP address
- 1.2.3.4, so the proxy sends the invitation there.
- </simpara>
- </section>
- </section>
- <section id="sip_intro.registrar">
- <title>Registrar</title>
- <simpara>
- We mentioned that the SIP proxy at proxy.b.com knows Bob's current location,
- but we haven't yet mentioned how a proxy can learn the current location of a
- user. Bob's user agent (SIP phone) must register with a
- <emphasis>registrar</emphasis>. The registrar is a special SIP entity that
- receives registrations from users, extracts information about their current
- location (IP address, port and username in this case) and stores the
- information in a location database. The purpose of the location database is to map
- sip:[email protected] to something like sip:[email protected]:5060. The location database is
- then used by B's proxy server. When the proxy receives an invitation for
- sip:[email protected], it searches the location database, finds
- sip:[email protected]:5060, and sends the invitation there. A registrar is very
- often a logical entity only. Because of their tight coupling with proxies,
- registrars are usually co-located with proxy servers.
- </simpara>
- <simpara>
- <xref linkend="registrar_fig"/> shows a typical SIP registration. A REGISTER
- message containing the Address of Record sip:[email protected] and the contact address
- sip:[email protected]:5060, where 1.2.3.4 is the IP address of the phone, is sent to the
- registrar. The registrar extracts this information and stores it in the
- location database. If everything went well, the registrar sends a 200 OK
- response to the phone and the process of registration is finished.
- </simpara>
- <figure id="registrar_fig">
- <title>Registrar Overview</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/registrar.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Picture showing a typical registrar</phrase>
- </textobject>
- </mediaobject>
- </figure>
- <simpara>
- Each registration has a limited lifespan. The Expires header field or the expires
- parameter of the Contact header field determines how long the registration is
- valid. The user agent must refresh the registration within the lifespan;
- otherwise it will expire and the user will become unavailable.
- </simpara>
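- <simpara>
- A location database with expiring registrations might look roughly like the
- following Python sketch. The class LocationDatabase, its methods and the example
- addresses are assumptions made for illustration; a real registrar also handles
- authentication, multiple contacts with priorities, and persistent storage. The
- now argument exists only to make the example deterministic.
- </simpara>
- <programlisting>
- <![CDATA[
```python
import time
from typing import Dict, List, Optional


class LocationDatabase:
    """Maps an address of record to contact URIs that expire over time."""

    def __init__(self) -> None:
        # address of record -> {contact URI: absolute expiry time in seconds}
        self._bindings: Dict[str, Dict[str, float]] = {}

    def register(self, aor: str, contact: str, expires: int,
                 now: Optional[float] = None) -> None:
        """Store or refresh a binding; expires=0 removes it."""
        now = time.time() if now is None else now
        contacts = self._bindings.setdefault(aor, {})
        if expires == 0:
            contacts.pop(contact, None)
        else:
            contacts[contact] = now + expires

    def lookup(self, aor: str, now: Optional[float] = None) -> List[str]:
        """Return the contacts whose registration has not expired yet."""
        now = time.time() if now is None else now
        contacts = self._bindings.get(aor, {})
        return [c for c, deadline in contacts.items() if deadline > now]


# Hypothetical registration that is valid for 120 seconds.
db = LocationDatabase()
db.register("sip:bob@b.example", "sip:bob@192.0.2.4:5060", expires=120, now=0.0)
print(db.lookup("sip:bob@b.example", now=60.0))   # within the lifespan
print(db.lookup("sip:bob@b.example", now=121.0))  # after the lifespan
```
- ]]>
- </programlisting>
- <simpara>
- In this sketch a refresh is simply another call to register with the same
- contact, which moves the expiry time forward.
- </simpara>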
- </section>
- <section id="redirect_server">
- <title>Redirect Server</title>
- <simpara>
- The entity that receives a request and sends back a reply containing a list of the
- current locations of a particular user is called a <emphasis>redirect server</emphasis>. A
- redirect server receives requests and looks up the intended recipient of the request in
- the location database created by a registrar. It then creates a list of current
- locations of the user and sends it to the request originator in a response of the 3xx
- class.
- </simpara>
- <simpara>
- The originator of the request then extracts the list of destinations and sends
- another request directly to them. <xref linkend="redirect"/> shows a typical
- redirection.
- </simpara>
- <figure id="redirect">
- <title>SIP Redirection</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/redirect.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Picture showing a redirection</phrase>
- </textobject>
- </mediaobject>
- </figure>
- </section>
- </section>
- <section id="sip_messages">
- <title>SIP Messages</title>
- <simpara>
- Communication using SIP (often called signaling) consists of a series of
- <emphasis>messages</emphasis>. Messages can be transported independently by the
- network. Usually each message is transported in a separate UDP datagram. Each
- message consists of a "first line", a message header, and a message body. The
- first line identifies the type of the message. There are two types of
- messages--<emphasis>requests</emphasis> and <emphasis>responses</emphasis>.
- Requests are usually used to initiate some action or to inform the recipient
- of something. Responses (also called replies) are used to confirm that a request was received and processed
- and contain the status of the processing.
- </simpara>
- <simpara>
- A typical SIP request looks like this:
- </simpara>
- <programlisting>
- <![CDATA[
- INVITE sip:[email protected] SIP/2.0
- Via: SIP/2.0/UDP 195.37.77.100:5040;rport
- Max-Forwards: 10
- From: "jiri" <sip:[email protected]>;tag=76ff7a07-c091-4192-84a0-d56e91fe104f
- To: <sip:[email protected]>
- Call-ID: [email protected]
- CSeq: 2 INVITE
- Contact: <sip:213.20.128.35:9315>
- User-Agent: Windows RTC/1.0
- Proxy-Authorization: Digest username="jiri", realm="iptel.org",
-   algorithm="MD5", uri="sip:[email protected]",
-   nonce="3cef753900000001771328f5ae1b8b7f0d742da1feb5753c",
-   response="53fe98db10e1074b03b3e06438bda70f"
- Content-Type: application/sdp
- Content-Length: 451

- v=0
- o=jku2 0 0 IN IP4 213.20.128.35
- s=session
- c=IN IP4 213.20.128.35
- b=CT:1000
- t=0 0
- m=audio 54742 RTP/AVP 97 111 112 6 0 8 4 5 3 101
- a=rtpmap:97 red/8000
- a=rtpmap:111 SIREN/16000
- a=fmtp:111 bitrate=16000
- a=rtpmap:112 G7221/16000
- a=fmtp:112 bitrate=24000
- a=rtpmap:6 DVI4/16000
- a=rtpmap:0 PCMU/8000
- a=rtpmap:4 G723/8000
- a=rtpmap:3 GSM/8000
- a=rtpmap:101 telephone-event/8000
- a=fmtp:101 0-16
- ]]>
- </programlisting>
- <simpara>
- The first line tells us that this is an INVITE message, which is used to establish a
- session. The URI on the first line, sip:[email protected], is called the <emphasis>Request
- URI</emphasis> and contains the URI of the next hop of the message. In this case it
- will be host iptel.org.
- </simpara>
- <simpara>
- A SIP request can contain one or more Via header fields, which are used to record the
- path of the request. They are later used to route SIP responses back along exactly the same
- path. The INVITE message contains just one Via header field, which was created by the
- user agent that sent the request. From the Via field we can tell that the user agent
- is running on host 195.37.77.100 and port 5040.
- </simpara>
- <simpara>
- The From and To header fields identify the initiator (caller) and the recipient (callee) of the
- invitation (just like in SMTP, where they identify the sender and the recipient of a
- message). The From header field contains a tag parameter, which serves as a dialog
- identifier and will be described in <xref linkend="sip_dialogs"/>.
- </simpara>
- <simpara>
- The Call-ID header field is a dialog identifier and its purpose is to identify messages
- belonging to the same call. Such messages have the same Call-ID identifier. CSeq is
- used to maintain the order of requests. Because requests can be sent over an unreliable
- transport that can re-order messages, a sequence number must be present in the
- messages so that the recipient can identify retransmissions and out-of-order requests.
- </simpara>
- <simpara>
- The Contact header field contains the IP address and port on which the sender is awaiting
- further requests sent by the callee. Other header fields are not important here and will
- not be described.
- </simpara>
- <simpara>
- The message header is delimited from the message body by an empty line. The message body of the INVITE
- request contains a description of the media types accepted by the sender, encoded in
- SDP.
- </simpara>
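- <simpara>
- The three parts of a message (first line, header, body) can be separated with a
- few lines of Python. The sketch below is an illustration only: the function
- split_sip_message and the shortened example request are hypothetical, the input
- is assumed to use CRLF line endings, and compact header names and repeated
- header fields are not treated specially.
- </simpara>
- <programlisting>
- <![CDATA[
```python
from typing import List, Tuple


def split_sip_message(raw: str) -> Tuple[str, List[Tuple[str, str]], str]:
    """Return (first line, [(field name, field value), ...], body)."""
    # The first empty line (CRLF CRLF) separates the header from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers: List[Tuple[str, str]] = []
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and headers:
            # Leading whitespace marks a continuation of the previous field.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))
    return lines[0], headers, body


# A shortened, hypothetical request used only for illustration.
message = (
    "INVITE sip:bob@b.example SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.0.2.1:5060\r\n"
    "CSeq: 2 INVITE\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "v=0\r\n"
)
first_line, headers, body = split_sip_message(message)
print(first_line)
print(headers)
print(repr(body))
```
- ]]>
- </programlisting>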
- <section id="sip_requests">
- <title>SIP Requests</title>
- <simpara>
- We have described what an INVITE request looks like and said that the request is
- used to invite a callee to a session. Other important requests are:
- </simpara>
- <itemizedlist>
- <listitem>
- <simpara>
- <emphasis>ACK</emphasis>--This message acknowledges receipt of a final
- response to INVITE. Establishing a session uses a 3-way
- handshake due to the asymmetric nature of the invitation. It may take a
- while before the callee accepts or declines the call, so the callee's
- user agent periodically retransmits a positive final response until it
- receives an ACK (which indicates that the caller is still there and
- ready to communicate).
- </simpara>
- </listitem>
- <listitem>
- <simpara>
- <emphasis>BYE</emphasis>--BYE messages are used to tear down multimedia
- sessions. A party wishing to tear down a session sends a BYE to the
- other party.
- </simpara>
- </listitem>
- <listitem>
- <simpara>
- <emphasis>CANCEL</emphasis>--CANCEL is used to cancel a session that is not yet fully
- established. It is used when the callee hasn't replied with a
- final response yet but the caller wants to abort the call (typically
- when the callee doesn't respond for some time).
- </simpara>
- </listitem>
- <listitem>
- <simpara>
- <emphasis>REGISTER</emphasis>--The purpose of a REGISTER request is to let the
- registrar know the user's current location. Information about the current
- IP address and port on which a user can be reached is carried in
- REGISTER messages. The registrar extracts this information and puts it into
- a location database. The database can later be used by SIP proxy
- servers to route calls to the user. Registrations are time-limited and
- need to be periodically refreshed.
- </simpara>
- </listitem>
- </itemizedlist>
- <simpara>
- The listed requests usually have no message body because it is not needed in
- most situations (although they can have one). Many other request types
- have been defined, but their description is outside the scope of this document.
- </simpara>
- </section>
- <section id="sip_responses">
- <title>SIP Responses</title>
- <simpara>
- When a user agent or proxy server receives a request, it sends a reply. Every
- request must be answered, except ACK requests, which trigger no replies.
- </simpara>
- <simpara>
- A typical reply looks like this:
- </simpara>
- <programlisting>
- <![CDATA[
- SIP/2.0 200 OK
- Via: SIP/2.0/UDP 192.168.1.30:5060;received=66.87.48.68
- From: sip:[email protected]
- To: sip:[email protected];tag=794fe65c16edfdf45da4fc39a5d2867c.b713
- Call-ID: [email protected]
- CSeq: 63629 REGISTER
- Contact: <sip:[email protected]:5060;transport=udp>;q=0.00;expires=120
- Server: Sip EXpress router (0.8.11pre21xrc (i386/linux))
- Content-Length: 0
- Warning: 392 195.37.77.101:5060 "Noisy feedback tells:
- pid=5110 req_src_ip=66.87.48.68 req_src_port=5060 in_uri=sip:iptel.org
- out_uri=sip:iptel.org via_cnt==1"
- ]]>
- </programlisting>
- <simpara>
- As we can see, responses are very similar to requests, except for the first
- line. The first line of a response contains the protocol version (SIP/2.0), the reply
- code, and the reason phrase.
- </simpara>
- <simpara>
- The <emphasis>reply code</emphasis> is an integer from 100 to 699 that
- indicates the type of the response. There are 6 classes of responses:
- </simpara>
- <itemizedlist>
- <listitem>
- <simpara>
- <emphasis>1xx</emphasis> are <emphasis>provisional</emphasis>
- responses. A provisional response tells its
- recipient that the associated request was received but that the result of the
- processing is not known yet. Provisional responses are sent only when
- the processing doesn't finish immediately. The sender must stop
- retransmitting the request upon reception of a provisional response.
- </simpara>
- <simpara>
- Typically, proxy servers send responses with code 100 (Trying) when they start
- processing an INVITE, and user agents send responses with code 180
- (Ringing), which means that the callee's phone is ringing.
- </simpara>
- </listitem>
- <listitem>
- <simpara>
- <emphasis>2xx</emphasis> responses are <emphasis>positive
- final</emphasis> responses. A final response is the ultimate response
- that the originator of the request will ever receive. Final
- responses therefore express the result of the processing of the associated
- request. Final responses also terminate transactions. Responses with
- codes from 200 to 299 are positive responses, which means that the request
- was processed successfully and accepted. For instance, a 200 OK response
- is sent when a user accepts an invitation to a session (INVITE request).
- </simpara>
- <simpara>
- A UAC may receive several 200 responses to a single INVITE
- request. This is because a forking proxy (see <xref linkend="stateful_servers"/>) can fork the
- request so that it reaches several UASs, and each of them may accept the
- invitation. In this case each response is distinguished by the tag
- parameter in the To header field. Each response represents a distinct dialog
- with an unambiguous dialog identifier.
- </simpara>
- </listitem>
- <listitem>
- <simpara>
- <emphasis>3xx</emphasis> responses are used to redirect a caller. A
- redirection response gives information about the user's new location or
- an alternative service that the caller might use to satisfy the
- call. Redirection responses are usually sent by proxy servers. When a
- proxy receives a request and doesn't want to or can't process it for any
- reason, it sends a redirection response to the caller and puts
- another location into the response, which the caller might want to
- try. It can be the location of another proxy or the current location of
- the callee (from the location database created by a registrar). The
- caller is then supposed to re-send the request to the new location. 3xx
- responses are final.
- </simpara>
- </listitem>
- <listitem>
- <simpara>
- <emphasis>4xx</emphasis> are <emphasis>negative final</emphasis>
- responses. A 4xx response means that the problem is on the sender's
- side. The request couldn't be processed because it contains bad syntax
- or cannot be fulfilled at that server.
- </simpara>
- </listitem>
- <listitem>
- <simpara>
- <emphasis>5xx</emphasis> means that the problem is on the server's side. The
- request is apparently valid but the server failed to fulfill it. Clients
- should usually retry the request later.
- </simpara>
- </listitem>
- <listitem>
- <simpara>
- A <emphasis>6xx</emphasis> reply code means that the request cannot be
- fulfilled at any server. This response is usually sent by a server that
- has definitive information about a particular user. User agents usually
- send a 603 Decline response when the user doesn't want to participate in
- the session.
- </simpara>
- </listitem>
- </itemizedlist>
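- <simpara>
- The six classes above can be derived from the first digit of the reply code, as
- the following Python sketch shows. The class names follow the wording used in
- RFC3261; the helper names response_class and is_final are assumptions made for
- this illustration.
- </simpara>
- <programlisting>
- <![CDATA[
```python
RESPONSE_CLASSES = {
    1: "provisional",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
    6: "global failure",
}


def response_class(code: int) -> str:
    """Name the class of a SIP reply code from its first digit."""
    if not 100 <= code <= 699:
        raise ValueError(f"reply code out of range: {code}")
    return RESPONSE_CLASSES[code // 100]


def is_final(code: int) -> bool:
    """Responses of class 2xx and above are final; 1xx are provisional."""
    return code >= 200


for code in (100, 180, 200, 302, 404, 503, 603):
    kind = "final" if is_final(code) else "not final"
    print(code, response_class(code), kind)
```
- ]]>
- </programlisting>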
- <simpara>
- In addition to the reply code, the first line also contains the <emphasis>reason
- phrase</emphasis>. The code number is intended to be processed by
- machines. It is not very human-friendly, but it is very easy for machines to parse and
- understand. The reason phrase usually contains a human-readable
- message describing the result of the processing. A user agent should render
- the reason phrase to the user.
- </simpara>
- <simpara>
- The request to which a particular response belongs is identified using the CSeq
- header field. In addition to the sequence number, this header field also contains the
- method of the corresponding request. In our example it was a REGISTER request.
- </simpara>
- </section>
- </section>
- <section id="sip_transactions">
- <title>SIP Transactions</title>
- <simpara>
- Although we said that SIP messages are sent independently over the network, they
- are usually arranged into <emphasis>transactions</emphasis> by user agents and
- certain types of proxy servers. Therefore SIP is said to be a
- <emphasis>transactional protocol</emphasis>.
- </simpara>
- <simpara>
- A transaction is a sequence of SIP messages exchanged between SIP network
- elements. A transaction consists of one request and all responses to that
- request. That includes zero or more provisional responses and one or more final
- responses (remember that an INVITE might be answered by more than one final response
- when a proxy server forks the request).
- </simpara>
- <simpara>
- If a transaction was initiated by an INVITE request then the same transaction also
- includes ACK, but only if the final response was not a 2xx response. If the final
- response was a 2xx response then the ACK is not considered part of the transaction.
- </simpara>
- <simpara>
- As we can see, this is quite asymmetric behavior: ACK is part of transactions with a
- negative final response but is not part of transactions with positive final
- responses. The reason for this separation is the importance of delivering all 200
- OK messages. Not only do they establish a session, but 200 OK responses can also be
- generated by multiple entities when a proxy server forks the request, and all of them
- must be delivered to the calling user agent. Therefore user agents take
- responsibility in this case and retransmit 200 OK responses until they receive an
- ACK. Also note that only responses to INVITE are retransmitted.
- </simpara>
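- <simpara>
- The retransmission of a 200 OK can be sketched as a simple schedule. This is only
- an illustration, assuming the default timer values from RFC3261 (T1 of 500 ms and
- T2 of 4 s): the interval starts at T1, doubles up to T2, and the user agent stops
- waiting for the ACK after 64*T1. The function name is hypothetical.
- </simpara>
- <programlisting language="python">
```python
from typing import Iterator

T1 = 0.5   # seconds; assumed RFC 3261 default
T2 = 4.0   # seconds; assumed RFC 3261 default cap on the interval


def ok_retransmit_times(t1: float = T1, t2: float = T2) -> Iterator[float]:
    """Yield the offsets (seconds after the first 200 OK) at which the
    200 OK is sent again while no ACK has arrived."""
    deadline = 64 * t1          # after this the UA gives up waiting
    interval = t1
    elapsed = interval
    while deadline > elapsed:
        yield elapsed
        interval = min(interval * 2, t2)
        elapsed += interval


schedule = list(ok_retransmit_times())
print(schedule)
```
- </programlisting>
- <simpara>
- A real user agent would cancel the schedule as soon as the matching ACK arrives.
- </simpara>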
- <simpara>
- SIP entities that have a notion of transactions are called
- <emphasis>stateful</emphasis>. Such entities usually create state associated with
- a transaction and keep it in memory for the duration of the transaction. When a
- request or response arrives, a stateful entity tries to associate it with an
- existing transaction. To do so, it must extract a unique
- transaction identifier from the message and compare it to the identifiers of all
- existing transactions. If a matching transaction exists, its state is updated
- from the message.
- </simpara>
- <simpara>
- In the previous SIP specification, RFC2543, the transaction identifier was calculated as a hash of
- all important message header fields (that included To, From, Request-URI and
- CSeq). This proved to be very slow and complex, and during interoperability tests such
- transaction identifiers were a common source of problems.
- </simpara>
- <simpara>
- In the new RFC3261 the way of calculating transaction identifiers was completely
- changed. Instead of complicated hashing of important header fields, a SIP message now
- carries the identifier directly: the branch parameter of the topmost Via header field
- contains the transaction identifier. This is a significant simplification, but there still exist old
- implementations that don't support the new way of calculating the transaction identifier, so
- even new implementations have to support the old way to remain backwards compatible.
- </simpara>
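- <simpara>
- A minimal sketch of how a stateful element might derive the lookup key for a
- response. It assumes the RFC3261 convention that a new-style branch value starts
- with the string z9hG4bK, and it pairs the branch with the method from CSeq. The
- function name and the sample message are hypothetical, and the parsing is
- deliberately simplified.
- </simpara>
- <programlisting language="python">
```python
import re
from typing import Optional, Tuple

MAGIC_COOKIE = "z9hG4bK"  # prefix that marks an RFC 3261 branch value


def transaction_key(message: str) -> Optional[Tuple[str, str]]:
    """Return (branch, CSeq method), or None for a legacy RFC 2543 message."""
    via = cseq = None
    for line in message.split("\r\n"):
        if line == "":
            break                      # blank line ends the header section
        name, _, value = line.partition(":")
        name = name.strip().lower()
        if name in ("via", "v") and via is None:
            via = value                # only the topmost Via matters
        elif name == "cseq":
            cseq = value
    if via is None or cseq is None:
        return None
    match = re.search(r";\s*branch=([^;,\s]+)", via)
    parts = cseq.split()
    if match is None or len(parts) != 2:
        return None
    branch = match.group(1)
    if not branch.startswith(MAGIC_COOKIE):
        return None                    # old-style sender: fall back to hashing
    return branch, parts[1]


# Hypothetical response used only for illustration.
response = "\r\n".join([
    "SIP/2.0 200 OK",
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds",
    "CSeq: 1 REGISTER",
    "",
    "",
])
print(transaction_key(response))
```
- </programlisting>
- <simpara>
- When the function returns None, a backwards-compatible implementation would fall
- back to the old hash-based matching, which is not shown here.
- </simpara>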
- <simpara>
- <xref linkend="transactions"/> shows what messages belong to what transactions
- during a conversation of two user agents.
- </simpara>
- <figure id="transactions">
- <title>SIP Transactions</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/transaction.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Message flow showing messages belonging to the same transaction.</phrase>
- </textobject>
- </mediaobject>
- </figure>
- </section>
- <section id="sip_dialogs">
- <title>SIP Dialogs</title>
- <simpara>
- We have shown what transactions are: one transaction includes the INVITE and its
- responses, and another transaction includes the BYE and its responses when a session is
- being torn down. But we feel that those two transactions should be somehow
- related--both of them belong to the same <emphasis>dialog</emphasis>. A dialog
- represents a peer-to-peer SIP relationship between two user agents. A dialog
- persists for some time and is a very important concept for user agents. Dialogs
- facilitate proper sequencing and routing of messages between SIP endpoints.
- </simpara>
- <simpara>
- Dialogs are identified using the Call-ID, the From tag, and the To
- tag. Messages in which these three identifiers are the same belong to the
- same dialog. We have shown that the CSeq header field is used to order
- messages; more precisely, it is used to order messages within a dialog. The
- number must be monotonically increased for each request sent within
- a dialog, otherwise the peer will handle it as an out-of-order request
- or a retransmission. In fact, the CSeq number identifies a
- transaction within a dialog, because we have said that a request and
- its associated responses are called a transaction. This means that only
- one transaction in each direction can be active within a
- dialog. One could also say that a <emphasis>dialog is a sequence of
- transactions</emphasis>. <xref linkend="dialog"/> extends <xref
- linkend="transactions"/> to show which messages belong to the
- same dialog.
- </simpara>
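- <simpara>
- The matching and ordering rules can be sketched as follows. This is a simplified
- illustration rather than a complete dialog layer: it tracks only the requests
- received from the peer, and all names and sample values are hypothetical.
- </simpara>
- <programlisting language="python">
```python
from typing import Dict, Tuple

DialogId = Tuple[str, str, str]        # (Call-ID, From tag, To tag)

# Highest CSeq number accepted from the peer, per known dialog.
last_cseq: Dict[DialogId, int] = {}


def open_dialog(dialog_id: DialogId, first_cseq: int) -> None:
    """Remember a dialog once the request that creates it has succeeded."""
    last_cseq[dialog_id] = first_cseq


def accept_in_dialog(dialog_id: DialogId, cseq: int) -> bool:
    """Return True only for a new, in-order request of a known dialog."""
    previous = last_cseq.get(dialog_id)
    if previous is None:
        return False                   # no such dialog
    if previous >= cseq:
        return False                   # out of order, or a retransmission
    last_cseq[dialog_id] = cseq
    return True


dialog = ("a84b4c76e66710", "1928301774", "a6c85cf")   # hypothetical values
open_dialog(dialog, first_cseq=1)      # e.g. the INVITE
print(accept_in_dialog(dialog, 2))     # a later BYE
print(accept_in_dialog(dialog, 2))     # the same BYE again
```
- </programlisting>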
- <figure id="dialog">
- <title>SIP Dialog</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/dialog.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Message flow showing transactions belonging to the same dialog.</phrase>
- </textobject>
- </mediaobject>
- </figure>
- <simpara>
- Some messages establish a dialog and some do not. This makes it possible to explicitly express
- the relationship between messages and also to send messages that are not related to other
- messages outside a dialog. The latter is easier to implement because user agents don't have
- to keep the dialog state.
- </simpara>
- <simpara>
- For instance, an INVITE message establishes a dialog, because it will later be followed
- by a BYE request which will tear down the session established by the INVITE. This BYE
- is sent within the dialog established by the INVITE.
- </simpara>
- <simpara>
- But if a user agent sends a MESSAGE request, such a request doesn't establish any
- dialog. Any subsequent messages (even MESSAGE) will be sent independently of the
- previous one.
- </simpara>
- <section id="dialogs_facilitate_routing">
- <title>Dialogs Facilitate Routing</title>
- <simpara>
- We have said that dialogs are also used to route messages between user
- agents. Let's describe this in a little more detail.
- </simpara>
- <simpara>
- Let's suppose that user sip:[email protected] wants to talk to user sip:[email protected]. The
- caller knows the SIP address of the callee (sip:[email protected]) but this address doesn't say
- anything about the current location of the user--i.e. the caller doesn't know to
- which host to send the request. Therefore the INVITE request will be sent to a
- proxy server.
- </simpara>
- <simpara>
- The request will be sent from proxy to proxy until it reaches one that knows
- the current location of the callee. This process is called routing. Once the request
- reaches the callee, the callee's user agent will create a response that will be
- sent back to the caller. The callee's user agent will also put a Contact header field
- into the response, which will contain the current location of the user. The
- original request also contained a Contact header field, which means that both user
- agents know the current location of the peer.
- </simpara>
- <simpara>
- Because the user agents know the location of each other, it is not necessary to send
- further requests to any proxy--they can be sent directly from user agent to user
- agent. That's exactly how dialogs facilitate routing.
- </simpara>
- <simpara>
- Further messages within a dialog are sent directly from user agent to user
- agent. This is a significant performance improvement because proxies do not see
- all the messages within a dialog; they are used to route just the first request
- that establishes the dialog. The direct messages are also delivered with much
- lower latency because a typical proxy usually implements complex routing
- logic. <xref linkend="trapezoid"/> contains an example of a message
- within a dialog (BYE) that bypasses the proxies.
- </simpara>
- <figure id="trapezoid">
- <title>SIP Trapezoid</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/trapezoid.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Message flow showing SIP trapezoid.</phrase>
- </textobject>
- </mediaobject>
- </figure>
- </section>
- <section id="dialogs_identifiers">
- <title>Dialog Identifiers</title>
- <simpara>
- We have already shown that dialog identifiers consist of three parts, the Call-ID,
- the From tag, and the To tag, but it is not that clear why dialog identifiers are
- created exactly this way and who contributes which part.
- </simpara>
- <simpara>
- Call-ID is the so-called <emphasis>call identifier</emphasis>. It must be a unique
- string that identifies a call. A call consists of one or more dialogs. Multiple
- user agents may respond to a request when a proxy along the path forks the
- request. Each user agent that sends a 2xx establishes a separate dialog with the
- caller. All such dialogs are part of the same call and have the same Call-ID.
- </simpara>
- <simpara>
- The From tag is generated by the caller and uniquely identifies the dialog in the
- caller's user agent.
- </simpara>
- <simpara>
- The To tag is generated by the callee and, just like the From tag, uniquely identifies
- the dialog in the callee's user agent.
- </simpara>
- <simpara>
- This hierarchical dialog identifier is necessary because a single call
- invitation can create several dialogs and the caller must be able to distinguish
- them.
- </simpara>
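- <simpara>
- The hierarchy can be illustrated with a short sketch: two answers to one forked
- INVITE share the Call-ID and the From tag but differ in the To tag, so the caller
- ends up with two dialogs inside one call. All values below are hypothetical.
- </simpara>
- <programlisting language="python">
```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

# Two hypothetical 200 OK responses to one forked INVITE:
# (Call-ID, From tag, To tag)
responses: List[Tuple[str, str, str]] = [
    ("a84b4c76e66710", "1928301774", "desk-phone-tag"),
    ("a84b4c76e66710", "1928301774", "mobile-tag"),
]

# Group the dialogs under the call they belong to.
dialogs_per_call: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)
for call_id, from_tag, to_tag in responses:
    dialogs_per_call[call_id].append((from_tag, to_tag))

for call_id, dialogs in dialogs_per_call.items():
    print(call_id, "has", len(dialogs), "dialogs:", dialogs)
```
- </programlisting>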
- </section>
- </section>
- <section id="typical_sip_scenarios">
- <title>Typical SIP Scenarios</title>
- <simpara>
- This section gives a brief overview of typical SIP scenarios that usually make up the
- SIP traffic.
- </simpara>
- <section id="registration">
- <title>Registration</title>
- <simpara>
- Users must register themselves with a registrar to be reachable by other
- users. A registration comprises a REGISTER message followed by a 200 OK sent by
- the registrar if the registration was successful. Registrations are usually
- authenticated, so a 401 Unauthorized challenge (or a 407 Proxy Authentication
- Required challenge, if a proxy requests the credentials) can appear if the user
- didn't provide valid
- credentials. <xref linkend="register_fig"/> shows an example of registration.
- </simpara>
- <figure id="register_fig">
- <title>REGISTER Message Flow</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/register.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Message flow of a registration.</phrase>
- </textobject>
- </mediaobject>
- </figure>
- </section>
- <section id="session_invitation">
- <title>Session Invitation</title>
- <simpara>
- A session invitation consists of one INVITE request which is usually sent to a
- proxy. The proxy immediately sends a 100 Trying reply to stop retransmissions
- and forwards the request further.
- </simpara>
- <simpara>
- All provisional responses generated by the callee are sent back to the caller. See the
- 180 Ringing response in the call flow. The response is generated when the callee's
- phone starts ringing.
- </simpara>
- <figure id="invite1">
- <title>INVITE Message Flow</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/invite1.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Picture showing a session invitation.</phrase>
- </textobject>
- </mediaobject>
- </figure>
- <simpara>
- A 200 OK is generated once the callee picks up the phone and it is retransmitted
- by the callee's user agent until it receives an ACK from the caller. The session
- is established at this point.
- </simpara>
- </section>
- <section id="session_termination">
- <title>Session Termination</title>
- <simpara>
- Session termination is accomplished by sending a BYE request within the dialog
- established by the INVITE. BYE messages are sent directly from one user agent to
- the other unless a proxy on the path of the INVITE request indicated that it
- wishes to stay on the path by using record routing (see <xref
- linkend="record_routing"/>).
- </simpara>
- <simpara>
- The party wishing to tear down a session sends a BYE request to the other party
- involved in the session. The other party sends a 200 OK response to confirm the
- BYE and the session is terminated. See the left message flow in
- <xref linkend="bye"/>.
- </simpara>
- </section>
- <section id="record_routing">
- <title>Record Routing</title>
- <simpara>
- All requests sent within a dialog are by default sent directly from one user agent
- to the other. Only requests outside a dialog traverse SIP proxies. This approach
- makes a SIP network more scalable because only a small number of SIP messages hit
- the proxies.
- </simpara>
- <simpara>
- There are certain situations in which a SIP proxy needs to stay on the path of all
- further messages. For instance, proxies controlling a NAT box or proxies doing
- accounting need to stay on the path of BYE requests.
- </simpara>
- <simpara>
- The mechanism by which a proxy can inform user agents that it wishes to stay on the path
- of all further messages is called <emphasis>record routing</emphasis>. Such a proxy
- inserts a Record-Route header field, which contains the address of
- the proxy, into SIP messages. Messages sent within a dialog will then traverse all SIP proxies that
- put a Record-Route header field into the message.
- </simpara>
- <simpara>
- The recipient of the request receives a set of Record-Route header fields in the
- message. It must mirror all the Record-Route header fields into responses because
- the originator of the request also needs to know the set of proxies.
- </simpara>
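- <simpara>
- How both user agents turn the Record-Route values into the list of proxies for
- later requests can be sketched as below. The sketch assumes the RFC3261 rule that
- the callee keeps the values in the order received while the caller reverses the
- mirrored values. Function names and proxy addresses are hypothetical.
- </simpara>
- <programlisting language="python">
```python
from typing import List


def callee_route_set(record_route: List[str]) -> List[str]:
    """The callee keeps the Record-Route values in the order received."""
    return list(record_route)


def caller_route_set(record_route: List[str]) -> List[str]:
    """The caller takes the mirrored values in reverse order."""
    return list(reversed(record_route))


# Hypothetical INVITE path: caller, p1, p2, callee. Each proxy adds its
# own URI on top, so the callee sees p2 first.
record_route = ["sip:p2.example.com;lr", "sip:p1.example.com;lr"]

print(callee_route_set(record_route))   # first hop for callee requests: p2
print(caller_route_set(record_route))   # first hop for caller requests: p1
```
- </programlisting>
- <simpara>
- Either way, each user agent ends up sending its in-dialog requests to the proxy
- closest to itself first.
- </simpara>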
- <figure id="bye">
- <title>BYE Message Flow (With and without Record Routing)</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/bye.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Picture showing BYE message flow with and without record routing.</phrase>
- </textobject>
- </mediaobject>
- </figure>
- <simpara>
- The left message flow of <xref linkend="bye"/> shows how a BYE (a request
- within the dialog established by the INVITE) is sent directly to the other user agent
- when there is no Record-Route header field in the message. The right message flow
- shows how the situation changes when the proxy puts a Record-Route header field
- into the message.
- </simpara>
- <section id="strict_vs_loose">
- <title>Strict versus Loose Routing</title>
- <simpara>
- The way record routing works has evolved. Record routing according to
- RFC2543 rewrote the Request-URI. That means the Request-URI always
- contained the URI of the next hop (which can be either the next proxy server that
- inserted a Record-Route header field or the destination user agent). Because of
- that it was necessary to save the original Request-URI as the last Route
- header field. This approach is called <emphasis>strict routing</emphasis>.
- </simpara>
- <simpara>
- <emphasis>Loose routing</emphasis>, as specified in RFC3261, works in a
- slightly different way. The Request-URI is no longer overwritten; it always
- contains the URI of the destination user agent. If there are any Route header
- fields in a message, then the message is sent to the URI from the topmost
- Route header field. This is a significant change--the Request-URI doesn't
- necessarily contain the URI to which the request will be sent. In fact, loose
- routing is very similar to IP source routing.
- </simpara>
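- <simpara>
- The next-hop decision of loose routing can be sketched in a few lines. The sketch
- assumes that each proxy removes its own entry from the top of the Route list
- before forwarding. Function names and addresses are hypothetical.
- </simpara>
- <programlisting language="python">
```python
from typing import List, Tuple


def next_hop(request_uri: str, routes: List[str]) -> str:
    """Loose routing: send to the topmost Route, else to the Request-URI."""
    return routes[0] if routes else request_uri


def proxy_forward(own_uri: str, request_uri: str,
                  routes: List[str]) -> Tuple[str, List[str]]:
    """A loose-routing proxy drops its own entry from the top of the
    Route list, then picks the next hop the same way."""
    if routes and routes[0] == own_uri:
        routes = routes[1:]
    return next_hop(request_uri, routes), routes


# Hypothetical BYE inside a dialog whose route set holds two proxies.
request_uri = "sip:callee@192.0.2.20"
routes = ["sip:p1.example.com;lr", "sip:p2.example.com;lr"]

path = [next_hop(request_uri, routes)]          # the user agent's first hop
while path[-1] != request_uri:
    hop, routes = proxy_forward(path[-1], request_uri, routes)
    path.append(hop)
print(path)
```
- </programlisting>
- <simpara>
- Note that the Request-URI stays the same on every hop; only the Route list shrinks.
- </simpara>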
- <simpara>
- Because the transition from strict routing to loose routing would break backwards
- compatibility and older user agents wouldn't work, it is necessary to make
- loose routing backwards compatible. The backwards compatibility
- unfortunately adds a lot of overhead and is often a source of major problems.
- </simpara>
- </section>
- </section>
- <section id="sub_not">
- <title>Event Subscription And Notification</title>
- <simpara>
- The SIP specification has been extended to support a general mechanism allowing
- subscription to asynchronous events. Such events can include SIP proxy statistics
- changes, presence information, session changes and so on.
- </simpara>
- <simpara>
- The mechanism is used mainly to convey information on presence (willingness to
- communicate) of users. <xref linkend="event"/> shows the basic message
- flow.
- </simpara>
- <figure id="event">
- <title>Event Subscription And Notification</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/event.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Picture showing subscription and notification.</phrase>
- </textobject>
- </mediaobject>
- </figure>
- <simpara>
- A user agent interested in event notification sends a SUBSCRIBE message to a
- SIP server. The SUBSCRIBE message establishes a dialog, and the server
- immediately replies with a 200 OK response. At this point the dialog is
- established. The server sends a NOTIFY request to the user every time the event
- to which the user subscribed changes. NOTIFY messages are sent within the dialog
- established by the SUBSCRIBE.
- </simpara>
- <simpara>
- Note that the first NOTIFY message in <xref linkend="event"/> is sent
- regardless of any event that triggers notifications.
- </simpara>
- <simpara>
- Subscriptions--as well as registrations--have a limited lifespan and therefore must be
- periodically refreshed.
- </simpara>
- </section>
- <section id="im">
- <title>Instant Messages</title>
- <simpara>
- Instant messages are sent using the MESSAGE request. MESSAGE requests do not establish a
- dialog and therefore they will always traverse the same set of proxies. This is the
- simplest form of sending instant messages. The text of the instant message is
- transported in the body of the SIP request.
- </simpara>
- <figure id="message">
- <title>Instant Messages</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="figures/message.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>Picture showing a MESSAGE.</phrase>
- </textobject>
- </mediaobject>
- </figure>
- </section>
- </section>
- </section>
|