sip_introduction.xml 46 KB

  1. <?xml version="1.0" encoding="UTF-8"?>
  2. <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  3. "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
  4. <section id="sip_intro" xmlns:xi="http://www.w3.org/2001/XInclude">
  5. <sectioninfo>
  6. <authorgroup>
  7. <author>
  8. <firstname>Jan</firstname>
  9. <surname>Janak</surname>
  10. <email>[email protected]</email>
  11. </author>
  12. </authorgroup>
  13. <copyright>
  14. <year>2003</year>
  15. <holder>FhG FOKUS</holder>
  16. </copyright>
  17. <abstract>
  18. <para>
  19. A brief overview of SIP describing all important aspects of the Session Initiation
  20. Protocol.
  21. </para>
  22. </abstract>
  23. </sectioninfo>
  24. <title>SIP Introduction</title>
  25. <section id="purpose">
  26. <title>Purpose of SIP</title>
  27. <simpara>
  28. SIP stands for Session Initiation Protocol. It is an application-layer control
  29. protocol which has been developed and designed within the IETF. The protocol has
  30. been designed with easy implementation, good scalability, and flexibility in mind.
  31. </simpara>
  32. <simpara>
  33. The specification is available in the form of several <abbrev>RFCs</abbrev>; the most
  34. important one is RFC 3261, which contains the core protocol specification. The
  35. protocol is used for creating, modifying, and terminating sessions with one or more
  36. participants. By a session we understand a set of senders and receivers that
  37. communicate, together with the state kept in those senders and receivers during the
  38. communication. Examples of sessions include Internet telephone calls,
  39. distribution of multimedia, multimedia conferences, and distributed computer games.
  40. </simpara>
  41. <simpara>
  42. SIP is not the only protocol that the communicating devices will need, and it is not
  43. meant to be a general-purpose protocol. The purpose of SIP is just to make the
  44. communication possible; the communication itself must be achieved by other means
  45. (and possibly another protocol). The two protocols most often used along with
  46. SIP are RTP and SDP. RTP is used to carry the real-time multimedia
  47. data (including audio, video, and text); it makes it possible to encode the data,
  48. split it into packets, and transport those packets over the
  49. Internet. SDP is used to describe and encode the
  50. capabilities of session participants. Such a description is then used to negotiate
  51. the characteristics of the session so that all the devices can participate (this
  52. includes, for example, negotiation of the codecs used to encode media so that all
  53. participants will be able to decode it, negotiation of the transport protocol used,
  54. and so on).
  55. </simpara>
  56. <simpara>
  57. SIP has been designed in conformance with the Internet model. It is an end-to-end
  58. oriented signaling protocol, which means that all the logic is stored in end
  59. devices (except routing of SIP messages). State is also stored in end devices
  60. only, so there is no single point of failure, and networks designed this way scale
  61. well. The price that we have to pay for this distribution and scalability is
  62. higher message overhead, caused by the messages being sent end-to-end.
  63. </simpara>
  64. <simpara>
  65. It is worth mentioning that the end-to-end concept of SIP is a significant
  66. divergence from the regular PSTN (Public Switched Telephone Network), where all the
  67. state and logic is stored in the network and end devices (telephones) are very
  68. primitive. The aim of SIP is to provide the same functionality that the traditional
  69. PSTN has, but the end-to-end design makes SIP networks much more powerful and
  70. open to the implementation of new services that can hardly be implemented in the
  71. traditional PSTN.
  72. </simpara>
  73. <simpara>
  74. SIP is based on HTTP, which in turn inherited the format of its message
  75. headers from RFC 822. HTTP is probably the most successful and widely used
  76. protocol on the Internet, and SIP tries to combine the best of both. In fact, HTTP
  77. can be classified as a signaling protocol too, because user agents use the protocol
  78. to tell an HTTP server which documents they are interested in. SIP is used to
  79. carry the description of session parameters; the description is encoded into a
  80. document using SDP. Both protocols (HTTP and SIP) have inherited the encoding of
  81. message headers from RFC 822. The encoding has proven to be robust and flexible
  82. over the years.
  83. </simpara>
  84. </section>
  85. <section id="sip_uri">
  86. <title>SIP URI</title>
  87. <simpara>
  88. SIP entities are identified using SIP URIs (Uniform Resource Identifiers). A
  89. SIP URI has the form sip:username@domain, for instance,
  90. sip:[email protected]. As we can see, a SIP URI consists of a username part and a
  91. domain name part delimited by the @ (at) character. SIP URIs are similar to
  92. e-mail addresses; it is, for instance, possible to use the same address for e-mail
  93. and SIP communication. Such URIs are easy to remember.
  94. </simpara>
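<simpara>
The structure of a SIP URI can be illustrated with a short Python sketch. It is
only an illustration under stated assumptions, not a complete RFC 3261 URI
parser: the helper name split_sip_uri and the example.com addresses are made up
for this example, and URI parameters, passwords, and IPv6 hosts are
deliberately ignored.
</simpara>
<programlisting>
<![CDATA[
```python
def split_sip_uri(uri):
    """Split a simple SIP URI into (username, host, port).

    Hypothetical helper for illustration; it handles only the
    sip:username@domain[:port] form described in the text.
    """
    scheme, sep, rest = uri.partition(":")
    if sep != ":" or scheme.lower() != "sip":
        raise ValueError("not a SIP URI: %r" % uri)
    username, at, hostport = rest.partition("@")
    if at != "@" or not username or not hostport:
        raise ValueError("expected sip:username@domain: %r" % uri)
    host, colon, port = hostport.partition(":")
    # The port is optional; None means that no explicit port was given.
    return username, host, int(port) if colon else None


print(split_sip_uri("sip:joe@example.com"))
print(split_sip_uri("sip:bob@192.0.2.4:5060"))
```
]]>
</programlisting>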
  95. </section>
  96. <section id="sip_network_elements">
  97. <title>SIP Network Elements</title>
  98. <simpara>
  99. Although in the simplest configuration it is possible to use just two user agents
  100. that send SIP messages directly to each other, a typical SIP network will
  101. contain more than one type of SIP element. The basic SIP elements are user agents,
  102. proxies, registrars, and redirect servers. We will briefly describe them in this
  103. section.
  104. </simpara>
  105. <simpara>
  106. Note that the elements, as presented in this section, are often only logical
  107. entities. It is often beneficial to co-locate them, for instance, to
  108. increase the speed of processing, but that depends on the particular implementation
  109. and configuration.
  110. </simpara>
  111. <section id="user_agents">
  112. <title>User Agents</title>
  113. <simpara>
  114. Internet end points that use SIP to find each other and to negotiate session
  115. characteristics are called <emphasis>user agents</emphasis>. User agents
  116. usually, but not necessarily, reside on a user's computer in the form of an
  117. application--this is currently the most widely used approach, but user agents
  118. can also be cellular phones, PSTN gateways, <acronym>PDAs</acronym>, automated
  119. <acronym>IVR</acronym> systems, and so on.
  120. </simpara>
  121. <simpara>
  122. User agents are often referred to as <emphasis>User Agent Server</emphasis>
  123. (UAS) and <emphasis>User Agent Client</emphasis> (UAC). UAS and UAC are
  124. logical entities only; each user agent contains both a UAC and a UAS. The UAC is the
  125. part of the user agent that sends requests and receives responses. The UAS is the
  126. part of the user agent that receives requests and sends responses.
  127. </simpara>
  128. <simpara>
  129. Because a user agent contains both a UAC and a UAS, we often say that a user
  130. agent behaves like a UAC or a UAS. For instance, the caller's user agent behaves
  131. like a UAC when it sends an INVITE request and receives responses to the
  132. request. The callee's user agent behaves like a UAS when it receives the INVITE
  133. and sends responses.
  134. </simpara>
  135. <simpara>
  136. But this situation changes when the callee decides to send a BYE and terminate
  137. the session. In this case the callee's user agent (sending the BYE) behaves like
  138. a UAC and the caller's user agent behaves like a UAS.
  139. </simpara>
  140. <figure id="uac_and_uas">
  141. <title>UAC and UAS</title>
  142. <mediaobject>
  143. <imageobject>
  144. <imagedata fileref="figures/ua.png" format="PNG"/>
  145. </imageobject>
  146. <textobject>
  147. <phrase>Picture showing UAC and UAS</phrase>
  148. </textobject>
  149. </mediaobject>
  150. </figure>
  151. <simpara>
  152. <xref linkend="uac_and_uas"/> shows three user agents and one stateful forking
  153. proxy. Each user agent contains a UAC and a UAS. The part of the proxy that
  154. receives the INVITE from the caller in fact acts as a UAS. When forwarding the
  155. request statefully, the proxy creates two UACs, each of which is responsible for
  156. one branch.
  157. </simpara>
  158. <simpara>
  159. In our example callee B picked up, and later, when he wants to tear down the call,
  160. his user agent sends a BYE. At this time the user agent that was previously a UAS becomes a
  161. UAC and vice versa.
  162. </simpara>
  163. </section>
  164. <section id="proxy_servers">
  165. <title>Proxy Servers</title>
  166. <simpara>
  167. In addition to user agents, SIP allows the creation of an infrastructure of network hosts
  168. called <emphasis>proxy servers</emphasis>. User agents can send messages to a
  169. proxy server. Proxy servers are very important entities in the SIP
  170. infrastructure. They perform routing of session invitations according to the
  171. invitee's current location, authentication, accounting, and many other important
  172. functions.
  173. </simpara>
  174. <simpara>
  175. The most important task of a proxy server is to route session invitations
  176. "closer" to the callee. The session invitation will usually traverse a
  177. set of proxies until it finds one which knows the actual location of the
  178. callee. Such a proxy will forward the session invitation directly to the callee
  179. and the callee will then accept or decline the session invitation.
  180. </simpara>
  181. <simpara>
  182. There are two basic types of SIP proxy servers--stateless and stateful.
  183. </simpara>
  184. <section id="stateless_servers">
  185. <title>Stateless Servers</title>
  186. <simpara>
  187. Stateless servers are simple message forwarders. They forward messages
  188. independently of each other. Although messages are usually arranged into
  189. transactions (see <xref linkend="sip_transactions"/>), stateless proxies
  190. do not keep track of transactions.
  191. </simpara>
  192. <simpara>
  193. Stateless proxies are simple and faster than stateful proxy servers. They
  194. can be used as simple load balancers, message translators, and routers. One
  195. of the drawbacks of stateless proxies is that they are unable to absorb
  196. retransmissions of messages or perform more advanced routing, for instance,
  197. forking or recursive traversal.
  198. </simpara>
  199. </section>
  200. <section id="stateful_servers">
  201. <title>Stateful Servers</title>
  202. <simpara>
  203. Stateful proxies are more complex. Upon reception of a request, stateful
  204. proxies create state and keep it until the transaction
  205. finishes. Some transactions, especially those created by INVITE, can last
  206. quite long (until the callee picks up or declines the call). Because stateful
  207. proxies must maintain the state for the duration of the transactions, their
  208. performance is limited.
  209. </simpara>
  210. <simpara>
  211. The ability to associate SIP messages into transactions gives stateful
  212. proxies some interesting features. Stateful proxies can perform forking,
  213. which means that upon reception of a message two or more messages will be sent
  214. out.
  215. </simpara>
  216. <simpara>
  217. Stateful proxies can absorb retransmissions because they know, from the
  218. transaction state, if they have already received the same message (stateless
  219. proxies cannot do the check because they keep no state).
  220. </simpara>
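<simpara>
How transaction state allows a proxy to absorb retransmissions can be sketched
in a few lines of Python. This is a toy model under stated assumptions, not the
behaviour of any particular proxy: the class name StatefulProxySketch is
hypothetical, and a transaction is identified here simply by the branch
parameter and sent-by value of the topmost Via header field together with the
request method, which is a simplification of the matching rules in RFC 3261.
</simpara>
<programlisting>
<![CDATA[
```python
class StatefulProxySketch:
    """Toy model of retransmission absorption in a stateful proxy."""

    def __init__(self):
        # Transaction state: one key per request seen so far.
        self.transactions = set()
        # Requests that were actually forwarded downstream.
        self.forwarded = []

    def receive(self, branch, sent_by, method):
        """Return True if the request was forwarded, False if absorbed."""
        key = (branch, sent_by, method)
        if key in self.transactions:
            # Same transaction seen before: this is a retransmission.
            return False
        self.transactions.add(key)
        self.forwarded.append(key)
        return True


proxy = StatefulProxySketch()
print(proxy.receive("z9hG4bK-1", "192.0.2.1:5060", "INVITE"))
print(proxy.receive("z9hG4bK-1", "192.0.2.1:5060", "INVITE"))
```
]]>
</programlisting>
<simpara>
A stateless proxy has no equivalent of the transactions set, so it would
forward both copies of the request.
</simpara>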
  221. <simpara>
  222. Stateful proxies can perform more complicated methods of finding a user. It
  223. is, for instance, possible to try to reach the user's office phone and, when he
  224. doesn't pick up, redirect the call to his cell phone. Stateless
  225. proxies can't do this because they have no way of knowing how the
  226. transaction targeted to the office phone finished.
  227. </simpara>
  228. <simpara>
  229. Most SIP proxies today are stateful because their configuration is usually
  230. very complex. They often perform accounting, forking, or some sort of NAT
  231. traversal aid, and all those features require a stateful proxy.
  232. </simpara>
  233. </section>
  234. <section id="proxy_server_usage">
  235. <title>Proxy Server Usage</title>
  236. <simpara>
  237. A typical configuration is that each centrally administered entity (a
  238. company, for instance) has its own SIP proxy server which is used by all
  239. user agents in the entity. Let's suppose that there are two companies, A and
  240. B, and each of them has its own proxy server. <xref linkend="companies"/>
  241. shows how a session invitation from employee Joe in company A will reach
  242. employee Bob in company B.
  243. </simpara>
  244. <figure id="companies">
  245. <title>Session Invitation</title>
  246. <mediaobject>
  247. <imageobject>
  248. <imagedata fileref="figures/companies.png" format="PNG"/>
  249. </imageobject>
  250. <textobject>
  251. <phrase>Picture showing a session invitation message flow</phrase>
  252. </textobject>
  253. </mediaobject>
  254. </figure>
  255. <simpara>
  256. User Joe uses the address sip:[email protected] to call Bob. Joe's user agent doesn't
  257. know how to route the invitation itself, but it is configured to send all
  258. outbound traffic to the company SIP proxy server proxy.a.com. The proxy
  259. server figures out that user sip:[email protected] is in a different company, so it
  260. will look up B's SIP proxy server and send the invitation there. B's proxy
  261. server can be either pre-configured at proxy.a.com or the proxy will use
  262. <acronym>DNS SRV</acronym> records to find B's proxy server. The invitation
  263. reaches proxy.b.com. The proxy knows that Bob is currently sitting in his
  264. office and is reachable through the phone on his desk, which has IP address
  265. 1.2.3.4, so the proxy will send the invitation there.
  266. </simpara>
  267. </section>
  268. </section>
  269. <section id="sip_intro.registrar">
  270. <title>Registrar</title>
  271. <simpara>
  272. We mentioned that the SIP proxy at proxy.b.com knows Bob's current location
  273. but haven't mentioned yet how a proxy can learn the current location of a
  274. user. Bob's user agent (SIP phone) must register with a
  275. <emphasis>registrar</emphasis>. The registrar is a special SIP entity that
  276. receives registrations from users, extracts information about their current
  277. location (IP address, port, and username in this case), and stores the
  278. information in a location database. The purpose of the location database is to map
  279. sip:[email protected] to something like sip:[email protected]:5060. The location database is
  280. then used by B's proxy server. When the proxy receives an invitation for
  281. sip:[email protected] it will search the location database. It finds
  282. sip:[email protected]:5060 and will send the invitation there. A registrar is very
  283. often a logical entity only. Because of their tight coupling with proxies,
  284. registrars are usually co-located with proxy servers.
  285. </simpara>
  286. <simpara>
  287. <xref linkend="registrar_fig"/> shows a typical SIP registration. A REGISTER
  288. message containing the Address of Record sip:[email protected] and the contact address
  289. sip:[email protected]:5060, where 1.2.3.4 is the IP address of the phone, is sent to the
  290. registrar. The registrar extracts this information and stores it in the
  291. location database. If everything went well, the registrar sends a 200 OK
  292. response to the phone and the process of registration is finished.
  293. </simpara>
  294. <figure id="registrar_fig">
  295. <title>Registrar Overview</title>
  296. <mediaobject>
  297. <imageobject>
  298. <imagedata fileref="figures/registrar.png" format="PNG"/>
  299. </imageobject>
  300. <textobject>
  301. <phrase>Picture showing a typical registrar</phrase>
  302. </textobject>
  303. </mediaobject>
  304. </figure>
  305. <simpara>
  306. Each registration has a limited lifespan. The Expires header field or the expires
  307. parameter of the Contact header field determines how long the registration is
  308. valid. The user agent must refresh the registration within the lifespan,
  309. otherwise it will expire and the user will become unavailable.
  310. </simpara>
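<simpara>
The interplay between registration, lookup, and expiry can be sketched with a
small in-memory location database in Python. This is a minimal sketch under
stated assumptions: the class LocationDatabase and its methods are hypothetical
names, the example.com addresses are placeholders, and only one contact per
Address of Record is kept, whereas a real registrar may store several.
</simpara>
<programlisting>
<![CDATA[
```python
import time


class LocationDatabase:
    """Toy in-memory location database (illustrative only)."""

    def __init__(self):
        # address of record -> (contact URI, absolute expiry time)
        self._bindings = {}

    def register(self, aor, contact, expires, now=None):
        """Store or refresh a binding that is valid for `expires` seconds."""
        now = time.time() if now is None else now
        self._bindings[aor] = (contact, now + expires)

    def lookup(self, aor, now=None):
        """Return the current contact, or None if absent or expired."""
        now = time.time() if now is None else now
        binding = self._bindings.get(aor)
        if binding is None:
            return None
        contact, expiry = binding
        if now >= expiry:
            # The registration was not refreshed in time.
            del self._bindings[aor]
            return None
        return contact


db = LocationDatabase()
db.register("sip:bob@example.com", "sip:bob@192.0.2.4:5060", expires=120, now=0)
print(db.lookup("sip:bob@example.com", now=60))
print(db.lookup("sip:bob@example.com", now=121))
```
]]>
</programlisting>
<simpara>
The optional now argument exists only so that the expiry logic can be exercised
without waiting; a proxy would call lookup without it.
</simpara>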
  311. </section>
  312. <section id="redirect_server">
  313. <title>Redirect Server</title>
  314. <simpara>
  315. The entity that receives a request and sends back a reply containing a list of the
  316. current locations of a particular user is called a <emphasis>redirect server</emphasis>. A
  317. redirect server receives requests and looks up the intended recipient of the request in
  318. the location database created by a registrar. It then creates a list of current
  319. locations of the user and sends it to the request originator in a response of the 3xx
  320. class.
  321. </simpara>
  322. <simpara>
  323. The originator of the request then extracts the list of destinations and sends
  324. another request directly to them. <xref linkend="redirect"/> shows a typical
  325. redirection.
  326. </simpara>
  327. <figure id="redirect">
  328. <title>SIP Redirection</title>
  329. <mediaobject>
  330. <imageobject>
  331. <imagedata fileref="figures/redirect.png" format="PNG"/>
  332. </imageobject>
  333. <textobject>
  334. <phrase>Picture showing a redirection</phrase>
  335. </textobject>
  336. </mediaobject>
  337. </figure>
  338. </section>
  339. </section>
  340. <section id="sip_messages">
  341. <title>SIP Messages</title>
  342. <simpara>
  343. Communication using SIP (often called signaling) consists of a series of
  344. <emphasis>messages</emphasis>. Messages can be transported independently by the
  345. network. Usually each message is transported in a separate UDP datagram. Each
  346. message consists of a "first line", a message header, and a message body. The
  347. first line identifies the type of the message. There are two types of
  348. messages--<emphasis>requests</emphasis> and <emphasis>responses</emphasis>.
  349. Requests are usually used to initiate some action or inform the recipient of the request
  350. of something. Responses are used to confirm that a request was received and processed
  351. and contain the status of the processing.
  352. </simpara>
  353. <simpara>
  354. A typical SIP request looks like this:
  355. </simpara>
  356. <programlisting>
  357. <![CDATA[
  358. INVITE sip:[email protected] SIP/2.0
  359. Via: SIP/2.0/UDP 195.37.77.100:5040;rport
  360. Max-Forwards: 10
  361. From: "jiri" <sip:[email protected]>;tag=76ff7a07-c091-4192-84a0-d56e91fe104f
  362. To: <sip:[email protected]>
  363. Call-ID: [email protected]
  364. CSeq: 2 INVITE
  365. Contact: <sip:213.20.128.35:9315>
  366. User-Agent: Windows RTC/1.0
  367. Proxy-Authorization: Digest username="jiri", realm="iptel.org",
  368.   algorithm="MD5", uri="sip:[email protected]",
  369.   nonce="3cef753900000001771328f5ae1b8b7f0d742da1feb5753c",
  370.   response="53fe98db10e1074b03b3e06438bda70f"
  371. Content-Type: application/sdp
  372. Content-Length: 451
  373. 
  374. v=0
  375. o=jku2 0 0 IN IP4 213.20.128.35
  376. s=session
  377. c=IN IP4 213.20.128.35
  378. b=CT:1000
  379. t=0 0
  380. m=audio 54742 RTP/AVP 97 111 112 6 0 8 4 5 3 101
  381. a=rtpmap:97 red/8000
  382. a=rtpmap:111 SIREN/16000
  383. a=fmtp:111 bitrate=16000
  384. a=rtpmap:112 G7221/16000
  385. a=fmtp:112 bitrate=24000
  386. a=rtpmap:6 DVI4/16000
  387. a=rtpmap:0 PCMU/8000
  388. a=rtpmap:4 G723/8000
  389. a=rtpmap:3 GSM/8000
  390. a=rtpmap:101 telephone-event/8000
  391. a=fmtp:101 0-16
  392. ]]>
  393. </programlisting>
  394. <simpara>
  395. The first line tells us that this is an INVITE message, which is used to establish a
  396. session. The URI on the first line, sip:[email protected], is called the <emphasis>Request
  397. URI</emphasis> and contains the URI of the next hop of the message. In this case it
  398. will be host iptel.org.
  399. </simpara>
  400. <simpara>
  401. A SIP request can contain one or more Via header fields, which are used to record
  402. the path of the request. They are later used to route SIP responses back along exactly
  403. the same path. The INVITE message contains just one Via header field, which was created by the
  404. user agent that sent the request. From the Via field we can tell that the user agent
  405. is running on host 195.37.77.100 and port 5040.
  406. </simpara>
  407. <simpara>
  408. The From and To header fields identify the initiator (caller) and the recipient (callee) of the
  409. invitation (just like in SMTP, where they identify the sender and recipient of a
  410. message). The From header field contains a tag parameter which serves as a dialog
  411. identifier and will be described in <xref linkend="sip_dialogs"/>.
  412. </simpara>
  413. <simpara>
  414. The Call-ID header field is a dialog identifier and its purpose is to identify messages
  415. belonging to the same call. Such messages have the same Call-ID identifier. CSeq is
  416. used to maintain the order of requests. Because requests can be sent over an unreliable
  417. transport that can re-order messages, a sequence number must be present in the
  418. messages so that the recipient can identify retransmissions and out-of-order requests.
  419. </simpara>
  420. <simpara>
  421. The Contact header field contains the IP address and port on which the sender is awaiting
  422. further requests sent by the callee. The other header fields are not important here and will
  423. not be described.
  424. </simpara>
  425. <simpara>
  426. The message header is delimited from the message body by an empty line. The message body of the INVITE
  427. request contains a description of the media types accepted by the sender, encoded in
  428. SDP.
  429. </simpara>
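<simpara>
The layout described above (first line, header fields, an empty line, then the
body) can be made concrete with a short Python sketch. It is a simplified
illustration rather than a conforming SIP parser: the function name
parse_sip_message and the sample message are assumptions made for this example,
and compact header forms and multi-valued header fields are not handled.
</simpara>
<programlisting>
<![CDATA[
```python
def parse_sip_message(text):
    """Split a SIP message into (first_line, headers, body).

    Simplified: headers is a list of (name, value) pairs; folded
    continuation lines are joined to the preceding header value.
    """
    text = text.replace("\r\n", "\n")
    # The first empty line separates the message header from the body.
    head, _, body = text.partition("\n\n")
    lines = head.split("\n")
    first_line = lines[0]
    headers = []
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and headers:
            # Continuation of the previous header field.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))
    return first_line, headers, body


message = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.0.2.1:5060\r\n"
    "CSeq: 2 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
)
first_line, headers, body = parse_sip_message(message)
print(first_line)
print(dict(headers)["CSeq"])
print(repr(body))
```
]]>
</programlisting>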
  430. <section id="sip_requests">
  431. <title>SIP Requests</title>
  432. <simpara>
  433. We have described what an INVITE request looks like and said that the request is
  434. used to invite a callee to a session. Other important requests are:
  435. </simpara>
  436. <itemizedlist>
  437. <listitem>
  438. <simpara>
  439. <emphasis>ACK</emphasis>--This message acknowledges receipt of a final
  440. response to INVITE. Establishing a session uses a three-way
  441. handshake due to the asymmetric nature of the invitation. It may take a
  442. while before the callee accepts or declines the call, so the callee's
  443. user agent periodically retransmits a positive final response until it
  444. receives an ACK (which indicates that the caller is still there and
  445. ready to communicate).
  446. </simpara>
  447. </listitem>
  448. <listitem>
  449. <simpara>
  450. <emphasis>BYE</emphasis>--BYE messages are used to tear down multimedia
  451. sessions. A party wishing to tear down a session sends a BYE to the
  452. other party.
  453. </simpara>
  454. </listitem>
  455. <listitem>
  456. <simpara>
  457. <emphasis>CANCEL</emphasis>--CANCEL is used to cancel a session that is not yet fully
  458. established. It is used when the callee hasn't replied with a
  459. final response yet but the caller wants to abort the call (typically
  460. when the callee doesn't respond for some time).
  461. </simpara>
  462. </listitem>
  463. <listitem>
  464. <simpara>
  465. <emphasis>REGISTER</emphasis>--The purpose of the REGISTER request is to let the
  466. registrar know the user's current location. Information about the current
  467. IP address and port on which a user can be reached is carried in
  468. REGISTER messages. The registrar extracts this information and puts it into
  469. a location database. The database can later be used by SIP proxy
  470. servers to route calls to the user. Registrations are time-limited and
  471. need to be periodically refreshed.
  472. </simpara>
  473. </listitem>
  474. </itemizedlist>
  475. <simpara>
  476. The listed requests usually have no message body because it is not needed in
  477. most situations (but can have one). In addition to that many other request types
  478. have been defined but their description is out of the scope of this document.
  479. </simpara>
  480. </section>
  481. <section id="sip_responses">
  482. <title>SIP Responses</title>
  483. <simpara>
  484. When a user agent or proxy server receives a request it sends a reply. Each
  485. request must be replied to, except ACK requests, which trigger no replies.
  486. </simpara>
  487. <simpara>
  488. A typical reply looks like this:
  489. </simpara>
  490. <programlisting>
  491. <![CDATA[
  492. SIP/2.0 200 OK
  493. Via: SIP/2.0/UDP 192.168.1.30:5060;received=66.87.48.68
  494. From: sip:[email protected]
  495. To: sip:[email protected];tag=794fe65c16edfdf45da4fc39a5d2867c.b713
  496. Call-ID: [email protected]
  497. CSeq: 63629 REGISTER
  498. Contact: <sip:[email protected]:5060;transport=udp>;q=0.00;expires=120
  499. Server: Sip EXpress router (0.8.11pre21xrc (i386/linux))
  500. Content-Length: 0
  501. Warning: 392 195.37.77.101:5060 "Noisy feedback tells:
  502. pid=5110 req_src_ip=66.87.48.68 req_src_port=5060 in_uri=sip:iptel.org
  503. out_uri=sip:iptel.org via_cnt==1"
  504. ]]>
  505. </programlisting>
  506. <simpara>
  507. As we can see, responses are very similar to requests, except for the first
  508. line. The first line of a response contains the protocol version (SIP/2.0), a reply
  509. code, and a reason phrase.
  510. </simpara>
  511. <simpara>
  512. The <emphasis>reply code</emphasis> is an integer from 100 to 699 and
  513. indicates the type of the response. There are 6 classes of responses:
  514. </simpara>
  515. <itemizedlist>
  516. <listitem>
  517. <simpara>
  518. <emphasis>1xx</emphasis> are <emphasis>provisional</emphasis>
  519. responses. A provisional response is a response that tells its
  520. recipient that the associated request was received but the result of the
  521. processing is not known yet. Provisional responses are sent only when
  522. the processing doesn't finish immediately. The sender must stop
  523. retransmitting the request upon reception of a provisional response.
  524. </simpara>
  525. <simpara>
  526. Typically proxy servers send responses with code 100 when they start
  527. processing an INVITE and user agents send responses with code 180
  528. (Ringing) which means that the callee's phone is ringing.
  529. </simpara>
  530. </listitem>
  531. <listitem>
  532. <simpara>
  533. <emphasis>2xx</emphasis> responses are <emphasis>positive
  534. final</emphasis> responses. A final response is the ultimate response
  535. that the originator of the request will ever receive. Therefore final
  536. responses express the result of the processing of the associated
  537. request. Final responses also terminate transactions. Responses with
  538. codes from 200 to 299 are positive responses, which means that the request
  539. was processed successfully and accepted. For instance, a 200 OK response
  540. is sent when a user accepts an invitation to a session (INVITE request).
  541. </simpara>
  542. <simpara>
  543. A UAC may receive several 200 responses to a single INVITE
  544. request. This is because a forking proxy (described later) can fork the
  545. request so that it reaches several UASs, and each of them may accept the
  546. invitation. In this case each response is distinguished by the tag
  547. parameter in the To header field. Each response represents a distinct dialog
  548. with an unambiguous dialog identifier.
  549. </simpara>
  550. </listitem>
  551. <listitem>
  552. <simpara>
  553. <emphasis>3xx</emphasis> responses are used to redirect a caller. A
  554. redirection response gives information about the user's new location or
  555. an alternative service that the caller might use to satisfy the
  556. call. Redirection responses are usually sent by proxy servers. When a
  557. proxy receives a request and doesn't want or can't process it for any
  558. reason, it will send a redirection response to the caller and put
  559. another location into the response which the caller might want to
  560. try. It can be the location of another proxy or the current location of
  561. the callee (from the location database created by a registrar). The
  562. caller is then supposed to re-send the request to the new location. 3xx
  563. responses are final.
  564. </simpara>
  565. </listitem>
  566. <listitem>
  567. <simpara>
  568. <emphasis>4xx</emphasis> are <emphasis>negative final</emphasis>
  569. responses. A 4xx response means that the problem is on the sender's
  570. side. The request couldn't be processed because it contains bad syntax
  571. or cannot be fulfilled at that server.
  572. </simpara>
  573. </listitem>
  574. <listitem>
  575. <simpara>
  576. <emphasis>5xx</emphasis> means that the problem is on the server's side. The
  577. request is apparently valid but the server failed to fulfill it. Clients
  578. should usually retry the request later.
  579. </simpara>
  580. </listitem>
  581. <listitem>
  582. <simpara>
<emphasis>6xx</emphasis> responses mean that the request cannot be
  584. fulfilled at any server. This response is usually sent by a server that
  585. has definitive information about a particular user. User agents usually
  586. send a 603 Decline response when the user doesn't want to participate in
  587. the session.
  588. </simpara>
  589. </listitem>
  590. </itemizedlist>
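<simpara>
The response classes listed above can be told apart by the first digit of the
status code. The following is a minimal sketch of that classification; the function
names and the class labels are illustrative assumptions, not part of any SIP library.
</simpara>
<programlisting language="python"><![CDATA[
```python
# Labels are illustrative; only the numeric ranges come from the text above.
CLASS_LABELS = {
    1: "provisional",
    2: "positive final",
    3: "redirection",
    4: "negative final (sender's side)",
    5: "negative final (server's side)",
    6: "negative final (cannot be fulfilled at any server)",
}


def response_class(status_code):
    """Map a SIP status code to its response class via the first digit."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return CLASS_LABELS[status_code // 100]


def is_final(status_code):
    """Every response outside the 1xx class is a final response."""
    return response_class(status_code) != "provisional"


print(response_class(180), is_final(180))
print(response_class(603), is_final(603))
```
]]></programlisting>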
  591. <simpara>
In addition to the status code, the first line also contains a <emphasis>reason
phrase</emphasis>. The code number is intended to be processed by
machines: it is not very human-friendly, but it is easy to parse and
interpret. The reason phrase usually contains a human-readable
message describing the result of the processing. A user agent should render
the reason phrase to the user.
  598. </simpara>
  599. <simpara>
The request to which a particular response belongs is identified using the CSeq
header field. In addition to the sequence number, this header field also contains
the method of the corresponding request. In our example it was a REGISTER request.
  603. </simpara>
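<simpara>
As a rough illustration of the first line and the CSeq header field described above,
the sketch below splits a status line into its code and reason phrase and a CSeq
value into its number and method. The helper names are assumptions made for this
example, and the parsing is deliberately simplified (no folding, no error recovery).
</simpara>
<programlisting language="python"><![CDATA[
```python
def parse_status_line(line):
    """Split 'SIP/2.0 200 OK' into (version, status code, reason phrase)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def parse_cseq(value):
    """Split a CSeq value such as '1 REGISTER' into (number, method)."""
    number, method = value.split()
    return int(number), method


def same_cseq(response_cseq, request_cseq):
    """Compare the CSeq of a response with the CSeq of a request.

    Simplification: a real stack also compares other fields, for
    example the Call-ID and the Via branch parameter.
    """
    return parse_cseq(response_cseq) == parse_cseq(request_cseq)


print(parse_status_line("SIP/2.0 200 OK"))
print(parse_cseq("1 REGISTER"))
```
]]></programlisting>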
  604. </section>
  605. </section>
  606. <section id="sip_transactions">
  607. <title>SIP Transactions</title>
  608. <simpara>
  609. Although we said that SIP messages are sent independently over the network, they
  610. are usually arranged into <emphasis>transactions</emphasis> by user agents and
  611. certain types of proxy servers. Therefore SIP is said to be a
  612. <emphasis>transactional protocol</emphasis>.
  613. </simpara>
  614. <simpara>
  615. A transaction is a sequence of SIP messages exchanged between SIP network
  616. elements. A transaction consists of one request and all responses to that
  617. request. That includes zero or more provisional responses and one or more final
  618. responses (remember that an INVITE might be answered by more than one final response
  619. when a proxy server forks the request).
  620. </simpara>
  621. <simpara>
  622. If a transaction was initiated by an INVITE request then the same transaction also
  623. includes ACK, but only if the final response was not a 2xx response. If the final
  624. response was a 2xx response then the ACK is not considered part of the transaction.
  625. </simpara>
  626. <simpara>
As we can see, this is quite asymmetric behavior--ACK is part of transactions with a
negative final response but is not part of transactions with positive final
responses. The reason for this separation is the importance of delivering every 200
OK message. Not only do these responses establish a session, but a 200 OK can also be
generated by multiple entities when a proxy server forks the request, and all of them
must be delivered to the calling user agent. Therefore user agents take
responsibility in this case and retransmit 200 OK responses until they receive an
ACK. Also note that only responses to INVITE are retransmitted!
  635. </simpara>
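<simpara>
The asymmetry described above can be written down as two small predicates. This is
only a sketch of the rule as stated in the text; the function names are hypothetical.
</simpara>
<programlisting language="python"><![CDATA[
```python
def ack_belongs_to_invite_transaction(final_status):
    """Return True when the ACK is part of the INVITE transaction.

    Following the rule above, that holds for every final response
    except those of the 2xx class.
    """
    if not 200 <= final_status <= 699:
        raise ValueError("expected a final SIP status code (200-699)")
    return not 200 <= final_status <= 299


def retransmit_until_ack(method, status):
    """The answering user agent resends a 2xx to INVITE until the ACK arrives."""
    return method == "INVITE" and 200 <= status <= 299


print(ack_belongs_to_invite_transaction(486))
print(ack_belongs_to_invite_transaction(200))
```
]]></programlisting>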
  636. <simpara>
SIP entities that have a notion of transactions are called
<emphasis>stateful</emphasis>. Such entities usually create state associated with
a transaction and keep it in memory for the duration of the transaction. When a
request or response arrives, a stateful entity tries to associate it with
an existing transaction. To do so, it must extract a unique
transaction identifier from the message and compare it to the identifiers of all
existing transactions. If a matching transaction exists, its state is updated
from the message.
  645. </simpara>
  646. <simpara>
In the previous SIP specification, RFC2543, the transaction identifier was calculated as a hash of
all important message header fields (including To, From, Request-URI and
CSeq). This proved to be very slow and complex, and during interoperability tests such
transaction identifiers were a common source of problems.
  651. </simpara>
  652. <simpara>
In the new RFC3261 the way of calculating transaction identifiers was completely
changed. Instead of complicated hashing of important header fields, a SIP message now
includes the identifier directly: the branch parameter of the Via header field contains
the transaction identifier. This is a significant simplification, but there still exist old
implementations that don't support the new way of calculating the transaction identifier, so
even new implementations have to support the old way. They must be backwards compatible.
  659. </simpara>
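<simpara>
A minimal sketch of branch-based matching follows. It assumes the RFC3261 convention
that new-style branch values start with the magic cookie z9hG4bK; the function names,
the shape of the lookup key and the host name in the example are assumptions made for
illustration. When no usable branch is present the sketch returns None, which stands
for falling back to the older header-based matching.
</simpara>
<programlisting language="python"><![CDATA[
```python
RFC3261_MAGIC_COOKIE = "z9hG4bK"


def via_branch(via_value):
    """Return the branch parameter of a single Via value, or None."""
    for param in via_value.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name.lower() == "branch":
            return value
    return None


def transaction_key(top_via, cseq_method):
    """Build a lookup key from the topmost Via value and the CSeq method.

    Returns None when the branch is missing or lacks the magic cookie,
    i.e. when the sender follows the older RFC2543 rules.
    """
    branch = via_branch(top_via)
    if branch is None or not branch.startswith(RFC3261_MAGIC_COOKIE):
        return None
    # The sent-by part (host[:port]) follows the protocol token.
    sent_by = top_via.split(";")[0].split()[-1].lower()
    # An ACK for a negative final response matches the INVITE transaction.
    method = "INVITE" if cseq_method == "ACK" else cseq_method
    return (branch, sent_by, method)


# Hypothetical Via value used only for this example.
via = "SIP/2.0/UDP pc33.example.com:5060;branch=z9hG4bK776asdhds"
print(transaction_key(via, "INVITE"))
```
]]></programlisting>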
  660. <simpara>
  661. <xref linkend="transactions"/> shows what messages belong to what transactions
  662. during a conversation of two user agents.
  663. </simpara>
  664. <figure id="transactions">
  665. <title>SIP Transactions</title>
  666. <mediaobject>
  667. <imageobject>
  668. <imagedata fileref="figures/transaction.png" format="PNG"/>
  669. </imageobject>
  670. <textobject>
  671. <phrase>Message flow showing messages belonging to the same transaction.</phrase>
  672. </textobject>
  673. </mediaobject>
  674. </figure>
  675. </section>
  676. <section id="sip_dialogs">
  677. <title>SIP Dialogs</title>
  678. <simpara>
We have shown what transactions are: one transaction includes an INVITE and its
responses, and another transaction includes a BYE and its responses when a session is
being torn down. But we feel that those two transactions should be somehow
related--both of them belong to the same <emphasis>dialog</emphasis>. A dialog
represents a peer-to-peer SIP relationship between two user agents. A dialog
persists for some time and it is a very important concept for user agents. Dialogs
facilitate proper sequencing and routing of messages between SIP endpoints.
  686. </simpara>
  687. <simpara>
Dialogs are identified using the Call-ID, the From tag, and the To
tag. Messages in which these three identifiers are the same belong to the
same dialog. We have shown that the CSeq header field is used to order
messages; in fact, it is used to order messages within a dialog. The
number must increase monotonically with each new request sent within
a dialog, otherwise the peer will handle it as an out-of-order request
or a retransmission. In fact, the CSeq number identifies a
transaction within a dialog, because we have said that a request and
its associated responses are called a transaction. This means that only
one transaction in each direction can be active within a
dialog. One could also say that a <emphasis>dialog is a sequence of
transactions</emphasis>. <xref linkend="dialog"/> extends <xref
linkend="transactions"/> to show which messages belong to the
same dialog.
  702. </simpara>
  703. <figure id="dialog">
  704. <title>SIP Dialog</title>
  705. <mediaobject>
  706. <imageobject>
  707. <imagedata fileref="figures/dialog.png" format="PNG"/>
  708. </imageobject>
  709. <textobject>
  710. <phrase>Message flow showing transactions belonging to the same dialog.</phrase>
  711. </textobject>
  712. </mediaobject>
  713. </figure>
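<simpara>
The CSeq ordering rule described above can be sketched as a small state holder kept
per dialog. The class below is a hypothetical illustration: it tracks only the
highest CSeq number received from the peer and ignores the special handling that
ACK and CANCEL need, because they reuse the number of the INVITE.
</simpara>
<programlisting language="python"><![CDATA[
```python
class DialogSequence:
    """Track the highest CSeq number seen from the peer in one dialog."""

    def __init__(self):
        self.remote_cseq = None  # no request received yet

    def classify(self, cseq_number):
        """Return 'new', 'retransmission' or 'out-of-order' for a request."""
        if self.remote_cseq is None or cseq_number > self.remote_cseq:
            self.remote_cseq = cseq_number
            return "new"
        if cseq_number == self.remote_cseq:
            return "retransmission"
        return "out-of-order"


seq = DialogSequence()
for number in (1, 1, 3, 2):
    print(number, seq.classify(number))
```
]]></programlisting>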
  714. <simpara>
Some messages establish a dialog and some do not. This makes it possible to express
the relationship between messages explicitly and also to send messages that are not related to other
messages outside a dialog. The latter is easier to implement because user agents don't have
to keep the dialog state.
  719. </simpara>
  720. <simpara>
For instance, an INVITE message establishes a dialog, because it will later be followed
by a BYE request that tears down the session established by the INVITE. This BYE
is sent within the dialog established by the INVITE.
  724. </simpara>
  725. <simpara>
  726. But if a user agent sends a MESSAGE request, such a request doesn't establish any
  727. dialog. Any subsequent messages (even MESSAGE) will be sent independently of the
  728. previous one.
  729. </simpara>
  730. <section id="dialogs_facilitate_routing">
  731. <title>Dialogs Facilitate Routing</title>
  732. <simpara>
We have said that dialogs are also used to route messages between user
agents. Let's describe this in a little more detail.
  735. </simpara>
  736. <simpara>
Let's suppose that user sip:[email protected] wants to talk to user sip:[email protected]. The caller
knows the SIP address of the callee (sip:[email protected]), but this address doesn't say
anything about the current location of the user--i.e. the caller doesn't know
which host to send the request to. Therefore the INVITE request will be sent to a
proxy server.
  742. </simpara>
  743. <simpara>
The request will be sent from proxy to proxy until it reaches one that knows
the current location of the callee. This process is called routing. Once the request
reaches the callee, the callee's user agent will create a response that will be
sent back to the caller. The callee's user agent will also put a Contact header field
into the response, which will contain the current location of the user. The
original request also contained a Contact header field, which means that both user
agents know the current location of the peer.
  751. </simpara>
  752. <simpara>
Because the user agents know each other's location, it is not necessary to send
further requests to any proxy--they can be sent directly from user agent to user
agent. That's exactly how dialogs facilitate routing.
  756. </simpara>
  757. <simpara>
Further messages within a dialog are sent directly from user agent to user
agent. This is a significant performance improvement because proxies do not see
all the messages within a dialog; they are used to route just the first request
that establishes the dialog. The direct messages are also delivered with much
lower latency because a typical proxy usually implements complex routing
logic. <xref linkend="trapezoid"/> contains an example of a message
within a dialog (BYE) that bypasses the proxies.
  765. </simpara>
  766. <figure id="trapezoid">
  767. <title>SIP Trapezoid</title>
  768. <mediaobject>
  769. <imageobject>
  770. <imagedata fileref="figures/trapezoid.png" format="PNG"/>
  771. </imageobject>
  772. <textobject>
  773. <phrase>Message flow showing SIP trapezoid.</phrase>
  774. </textobject>
  775. </mediaobject>
  776. </figure>
  777. </section>
  778. <section id="dialogs_identifiers">
  779. <title>Dialog Identifiers</title>
  780. <simpara>
We have already shown that dialog identifiers consist of three parts, the Call-ID,
the From tag, and the To tag, but it is not that clear why dialog identifiers are
created exactly this way and who contributes which part.
  784. </simpara>
  785. <simpara>
Call-ID is a so-called <emphasis>call identifier</emphasis>. It must be a unique
  787. string that identifies a call. A call consists of one or more dialogs. Multiple
  788. user agents may respond to a request when a proxy along the path forks the
  789. request. Each user agent that sends a 2xx establishes a separate dialog with the
  790. caller. All such dialogs are part of the same call and have the same Call-ID.
  791. </simpara>
  792. <simpara>
The From tag is generated by the caller and it uniquely identifies the dialog in the
caller's user agent.
  795. </simpara>
  796. <simpara>
The To tag is generated by the callee and, just like the From tag, it uniquely
identifies the dialog in the callee's user agent.
  799. </simpara>
  800. <simpara>
This hierarchical dialog identifier is necessary because a single call
invitation can create several dialogs and the caller must be able to distinguish
them.
  804. </simpara>
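<simpara>
To make the hierarchy concrete, the sketch below models a dialog identifier as a
triple and feeds it two 200 OK responses to one forked INVITE. All identifier values
and the names DialogId and dialog_id_for_caller are made up for this example.
</simpara>
<programlisting language="python"><![CDATA[
```python
from collections import namedtuple

DialogId = namedtuple("DialogId", ["call_id", "local_tag", "remote_tag"])


def dialog_id_for_caller(call_id, from_tag, to_tag):
    """On the caller's side the From tag is local and the To tag is remote."""
    return DialogId(call_id, from_tag, to_tag)


# Two 200 OK responses to one forked INVITE: the Call-ID and From tag are
# shared, but each callee contributed its own To tag. Values are made up.
responses = [
    {"call_id": "a84b4c76e66710", "from_tag": "1928301774", "to_tag": "a6c85cf"},
    {"call_id": "a84b4c76e66710", "from_tag": "1928301774", "to_tag": "314159"},
]

dialogs = {
    dialog_id_for_caller(r["call_id"], r["from_tag"], r["to_tag"])
    for r in responses
}
calls = {d.call_id for d in dialogs}
print(len(calls), "call,", len(dialogs), "dialogs")
```
]]></programlisting>
<simpara>
Running the sketch prints "1 call, 2 dialogs": the Call-ID groups both dialogs
into one call, while the To tags keep them apart.
</simpara>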
  805. </section>
  806. </section>
  807. <section id="typical_sip_scenarios">
  808. <title>Typical SIP Scenarios</title>
  809. <simpara>
  810. This section gives a brief overview of typical SIP scenarios that usually make up the
  811. SIP traffic.
  812. </simpara>
  813. <section id="registration">
  814. <title>Registration</title>
  815. <simpara>
Users must register themselves with a registrar to be reachable by other
users. A registration comprises a REGISTER message followed by a 200 OK sent by
the registrar if the registration was successful. Registrations are usually
authenticated, so a 407 reply can appear if the user didn't provide valid
credentials. <xref linkend="register_fig"/> shows an example of registration.
  821. </simpara>
  822. <figure id="register_fig">
  823. <title>REGISTER Message Flow</title>
  824. <mediaobject>
  825. <imageobject>
  826. <imagedata fileref="figures/register.png" format="PNG"/>
  827. </imageobject>
  828. <textobject>
  829. <phrase>Message flow of a registration.</phrase>
  830. </textobject>
  831. </mediaobject>
  832. </figure>
  833. </section>
  834. <section id="session_invitation">
  835. <title>Session Invitation</title>
  836. <simpara>
A session invitation consists of one INVITE request, which is usually sent to a
proxy. The proxy immediately sends a 100 Trying reply to stop retransmissions
and forwards the request further.
  840. </simpara>
  841. <simpara>
All provisional responses generated by the callee are sent back to the caller. See
the 180 Ringing response in the call flow. The response is generated when the callee's
phone starts ringing.
  845. </simpara>
  846. <figure id="invite1">
  847. <title>INVITE Message Flow</title>
  848. <mediaobject>
  849. <imageobject>
  850. <imagedata fileref="figures/invite1.png" format="PNG"/>
  851. </imageobject>
  852. <textobject>
  853. <phrase>Picture showing a session invitation.</phrase>
  854. </textobject>
  855. </mediaobject>
  856. </figure>
  857. <simpara>
  858. A 200 OK is generated once the callee picks up the phone and it is retransmitted
  859. by the callee's user agent until it receives an ACK from the caller. The session
  860. is established at this point.
  861. </simpara>
  862. </section>
  863. <section id="session_termination">
  864. <title>Session Termination</title>
  865. <simpara>
Session termination is accomplished by sending a BYE request within the dialog
established by the INVITE. BYE messages are sent directly from one user agent to
the other unless a proxy on the path of the INVITE request indicated that it
wishes to stay on the path by using record routing (see <xref
linkend="record_routing"/>).
  871. </simpara>
  872. <simpara>
The party wishing to tear down a session sends a BYE request to the other party
involved in the session. The other party sends a 200 OK response to confirm the
BYE and the session is terminated. See <xref linkend="bye"/>, left message
flow.
  877. </simpara>
  878. </section>
  879. <section id="record_routing">
  880. <title>Record Routing</title>
  881. <simpara>
All requests sent within a dialog are by default sent directly from one user agent
to the other. Only requests outside a dialog traverse SIP proxies. This approach
makes a SIP network more scalable because only a small number of SIP messages hit
the proxies.
  886. </simpara>
  887. <simpara>
There are certain situations in which a SIP proxy needs to stay on the path of all
further messages. For instance, proxies controlling a NAT box or proxies doing
accounting need to stay on the path of BYE requests.
  891. </simpara>
  892. <simpara>
The mechanism by which a proxy can inform user agents that it wishes to stay on the path
of all further messages is called <emphasis>record routing</emphasis>. Such a proxy
inserts a Record-Route header field containing the address of the proxy into SIP messages.
Messages sent within a dialog will then traverse all SIP proxies that
put a Record-Route header field into the message.
  898. </simpara>
  899. <simpara>
  900. The recipient of the request receives a set of Record-Route header fields in the
  901. message. It must mirror all the Record-Route header fields into responses because
  902. the originator of the request also needs to know the set of proxies.
  903. </simpara>
  904. <figure id="bye">
  905. <title>BYE Message Flow (With and without Record Routing)</title>
  906. <mediaobject>
  907. <imageobject>
  908. <imagedata fileref="figures/bye.png" format="PNG"/>
  909. </imageobject>
  910. <textobject>
  911. <phrase>Picture showing BYE message flow with and without record routing.</phrase>
  912. </textobject>
  913. </mediaobject>
  914. </figure>
  915. <simpara>
The left message flow of <xref linkend="bye"/> shows how a BYE (a request
within the dialog established by the INVITE) is sent directly to the other user agent
when there is no Record-Route header field in the message. The right message flow
shows how the situation changes when the proxy puts a Record-Route header field
into the message.
  921. </simpara>
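<simpara>
The mirroring of Record-Route header fields can be sketched as follows. The sketch
assumes the RFC3261 rule that the callee keeps the recorded URIs in the order
received while the caller reads them from the response in reverse order; the proxy
names and function names are hypothetical.
</simpara>
<programlisting language="python"><![CDATA[
```python
def route_set_for_callee(record_route):
    """The callee keeps the Record-Route entries in the order received."""
    return list(record_route)


def route_set_for_caller(record_route):
    """The caller takes the mirrored entries in reverse order, so that
    the proxy nearest to the caller comes first."""
    return list(reversed(record_route))


# The INVITE passed p1 and then p2; each proxy put its URI on top, so the
# callee sees p2 first. Both proxy names are made up for this example.
received = ["sip:p2.example.com;lr", "sip:p1.example.com;lr"]

print(route_set_for_callee(received))  # path for requests sent by the callee
print(route_set_for_caller(received))  # path for requests sent by the caller
```
]]></programlisting>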
  922. <section id="strict_vs_loose">
  923. <title>Strict versus Loose Routing</title>
  924. <simpara>
The way record routing works has evolved. Record routing according to
RFC2543 rewrote the Request-URI. That means the Request-URI always
contained the URI of the next hop (which can be either the next proxy server that
inserted a Record-Route header field or the destination user agent). Because of
that it was necessary to save the original Request-URI as the last Route
header field. This approach is called <emphasis>strict routing</emphasis>.
  931. </simpara>
  932. <simpara>
<emphasis>Loose routing</emphasis>, as specified in RFC3261, works in a
slightly different way. The Request-URI is no longer overwritten; it always
contains the URI of the destination user agent. If there are any Route header
fields in a message, then the message is sent to the URI from the topmost
Route header field. This is a significant change--the Request-URI doesn't
necessarily contain the URI to which the request will be sent. In fact, loose
routing is very similar to IP source routing.
  940. </simpara>
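<simpara>
The next-hop decision under loose routing can be sketched in a few lines. The
sketch assumes that every proxy in the route set is a loose router, which RFC3261
marks with the lr URI parameter; the function names and addresses are hypothetical,
and handling of strict routers is left out.
</simpara>
<programlisting language="python"><![CDATA[
```python
def is_loose_router(uri):
    """Check whether a Route URI carries the lr parameter."""
    params = uri.strip("<>").split(";")[1:]
    return any(p.split("=")[0].lower() == "lr" for p in params)


def next_hop(request_uri, route_headers):
    """Pick the URI a request is sent to under loose routing.

    The Request-URI is not rewritten; it keeps naming the destination
    user agent. Only the place the message is sent to changes.
    """
    if route_headers:
        return route_headers[0]  # topmost Route header field
    return request_uri


# Hypothetical addresses used only for this example.
target = "sip:callee@192.0.2.4"
routes = ["<sip:p1.example.com;lr>", "<sip:p2.example.com;lr>"]

print(next_hop(target, routes))
print(next_hop(target, []))
```
]]></programlisting>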
  941. <simpara>
Because the transition from strict routing to loose routing would break backwards
compatibility and older user agents wouldn't work, it is necessary to make
loose routing backwards compatible. The backwards compatibility
unfortunately adds a lot of overhead and is often a source of major problems.
  946. </simpara>
  947. </section>
  948. </section>
  949. <section id="sub_not">
  950. <title>Event Subscription And Notification</title>
  951. <simpara>
The SIP specification has been extended to support a general mechanism allowing
subscription to asynchronous events. Such events can include SIP proxy statistics
changes, presence information, session changes and so on.
  955. </simpara>
  956. <simpara>
  957. The mechanism is used mainly to convey information on presence (willingness to
  958. communicate) of users. <xref linkend="event"/> shows the basic message
  959. flow.
  960. </simpara>
  961. <figure id="event">
  962. <title>Event Subscription And Notification</title>
  963. <mediaobject>
  964. <imageobject>
  965. <imagedata fileref="figures/event.png" format="PNG"/>
  966. </imageobject>
  967. <textobject>
  968. <phrase>Picture showing subscription and notification.</phrase>
  969. </textobject>
  970. </mediaobject>
  971. </figure>
  972. <simpara>
A user agent interested in event notification sends a SUBSCRIBE message to a
SIP server. The SUBSCRIBE message establishes a dialog, and the server immediately
replies with a 200 OK response. At this point the dialog is
established. The server sends a NOTIFY request to the user every time the event
to which the user subscribed changes. NOTIFY messages are sent within the dialog
established by the SUBSCRIBE.
  979. </simpara>
  980. <simpara>
  981. Note that the first NOTIFY message in <xref linkend="event"/> is sent
  982. regardless of any event that triggers notifications.
  983. </simpara>
  984. <simpara>
Subscriptions--as well as registrations--have a limited lifespan and therefore must be
periodically refreshed.
  987. </simpara>
  988. </section>
  989. <section id="im">
  990. <title>Instant Messages</title>
  991. <simpara>
Instant messages are sent using the MESSAGE request. MESSAGE requests do not establish a
dialog and therefore they will always traverse the same set of proxies. This is the
simplest form of sending instant messages. The text of the instant message is
transported in the body of the SIP request.
  996. </simpara>
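<simpara>
As an illustration of carrying the text in the request body, the sketch below
assembles a minimal MESSAGE request as bytes. The function name, all addresses and
all identifier values are assumptions made for this example; a real user agent would
generate unique Call-ID, tag and branch values and might add further header fields.
</simpara>
<programlisting language="python"><![CDATA[
```python
def build_message_request(from_uri, to_uri, text, call_id, cseq, branch, tag):
    """Assemble a minimal MESSAGE request with a plain-text body."""
    body = text.encode("utf-8")
    lines = [
        f"MESSAGE {to_uri} SIP/2.0",
        # The sending host name below is a placeholder.
        f"Via: SIP/2.0/UDP client.example.com;branch={branch}",
        "Max-Forwards: 70",
        f"From: <{from_uri}>;tag={tag}",
        f"To: <{to_uri}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} MESSAGE",
        "Content-Type: text/plain",
        f"Content-Length: {len(body)}",
        "",  # an empty line separates the header fields from the body
        "",
    ]
    return "\r\n".join(lines).encode("utf-8") + body


# Every value below is made up for the example.
request = build_message_request(
    "sip:alice@example.com", "sip:bob@example.org", "Hello",
    call_id="asd88asd77a", cseq=1, branch="z9hG4bK776sgdkse", tag="49583",
)
print(request.decode("utf-8"))
```
]]></programlisting>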
  997. <figure id="message">
  998. <title>Instant Messages</title>
  999. <mediaobject>
  1000. <imageobject>
  1001. <imagedata fileref="figures/message.png" format="PNG"/>
  1002. </imageobject>
  1003. <textobject>
  1004. <phrase>Picture showing a MESSAGE.</phrase>
  1005. </textobject>
  1006. </mediaobject>
  1007. </figure>
  1008. </section>
  1009. </section>
  1010. </section>