Browse source code

initial technical memos submitted

Jiri Kuthan 23 years ago
parent
commit
9cd60a0081
4 changed files with 947 additions and 2 deletions
  1. 19 2
      doc/tmemo/README
  2. 259 0
      doc/tmemo/tmemo-jiri-b2bua.txt
  3. 395 0
      doc/tmemo/tmemo-jiri-media.txt
  4. 274 0
      doc/tmemo/tmemo-jiri-vmail.txt

+ 19 - 2
doc/tmemo/README

@@ -1,2 +1,19 @@
-This directory contains short memos documenting design decisions
-made in ser or accompanying applications.
+This directory contains short technical memos documenting 
+technical decisions made or planned to be made in ser or 
+accompanying applications. The memos serve as requests
+for comments in the literal sense (not to be confused
+with IETF's RFCs).
+
+The documents here are drafts, for whose technical maturity
+no guarantee can be provided.  They may advocate non-workable
+design ideas, frequently change, or be replaced by better
+technology suggestions. They may or may not be implemented.
+
+The memo texts follow IETF traditions: they are encoded
+in plain ASCII. Their filenames consist of
+ - tmemo (=technical memo -- to avoid confusion with IETF prefixes)
+ - author id
+ - text id
+For example: tmemo-johndoe-backtobackua.txt
+No version numbers are used in filenames -- these are displayed
+in the text and assigned by the CVS server.
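
The filename convention described in the README above can be illustrated with a short Python sketch that splits a memo filename into its parts. The function name `parse_tmemo_name` and the rule that author and text ids are lowercase alphanumerics without hyphens are assumptions made for this example; the README only gives the three components and one sample name.

```python
import re
from typing import NamedTuple, Optional


class TmemoName(NamedTuple):
    author_id: str
    text_id: str


# Assumption: ids contain only lowercase letters and digits, as in the
# README's own example "tmemo-johndoe-backtobackua.txt".
_TMEMO_RE = re.compile(r"^tmemo-([a-z0-9]+)-([a-z0-9]+)\.txt$")


def parse_tmemo_name(filename: str) -> Optional[TmemoName]:
    """Return the author and text ids, or None if the name does not fit."""
    match = _TMEMO_RE.match(filename)
    if match is None:
        return None
    return TmemoName(author_id=match.group(1), text_id=match.group(2))
```

A name carrying a version suffix, which the README rules out, is rejected by this pattern.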

+ 259 - 0
doc/tmemo/tmemo-jiri-b2bua.txt

@@ -0,0 +1,259 @@
+$Id$
+
+Building Prepaid Scenarios Using SIP/SER
+========================================
+
+Jiri Kuthan, iptel.org, January 2003
+
+Abstract
+--------
+Prepaid scenarios for making calls to PSTN gateways require the 
+ability to terminate an existing call when the user's credit is
+exhausted. Though it seems appropriate to implement such 
+a feature in the device providing the service (i.e., in the gateway),
+we are currently not aware of such gateways. We thus first
+recommend a session-timer based approach which possibly works,
+and requires limited support in end-devices (session-timer)
+and proxy servers (session-timer and call length determination).
+We then discuss another alternative, based on a B2BUA middlebox,
+which works even with the dumbest PSTN gateways but puts
+a considerable workload on the SIP server.
+
+TOC
+---
+Section 1 explains the design alternatives available when
+designing a "forced" call termination (FCT). The design alternatives
+are FCT support in end-devices (nice, but not doable with current
+gateways), FCT support using session-timer (nice, hopefully doable,
+requires session-timer support) and FCT using a B2BUA (ugly
+and costly, but the most backwards-compatible).
+
+Section 2 details known drawbacks of the B2BUA technology.
+
+Preliminary hints on how to implement a B2BUA using ser,
+which has no B2BUA support, are detailed in section 3.
+
+1. How To Terminate a Call When No Money Is Left
+--------------------------------------------------
+
+In general, there are many ways to implement a call cut-off driven
+by the service operator. We argue that architecturally the most
+acceptable place for this functionality is the terminating PSTN
+end-device. The device already keeps session state, it also knows
+when things go wrong on the PSTN side, and it is able to detect the
+caller's media inactivity -- it is simply fully in control of the
+call. Thus it seems an ideal place for implementing call termination.
+No other element in the network knows all the things the end-device
+knows.
+
+The missing piece is then the ability to determine the maximum call
+duration. A consistent application of the approach of placing the
+logic in the end-device would make the gateway query some database.
+(It is of limited use to store this information directly in
+the gateway, as multiple devices may want to share this piece of
+information.) However, support for such a "query-credit" protocol
+does not exist in PSTN gateways. Other solutions are thus sought.
+
+One way to make the gateway aware of the maximum call duration is
+to determine it in a proxy server (which typically has programming
+capabilities that allow doing so) and propagate it to gateways
+using the SIP session timer.
+  http://www.iptel.org/ietf/callsignalling/#draft-ietf-sip-session-timer
+This solution is scalable in that the element determining the maximum
+call length is a proxy server, which is at most transactionally
+stateful. No call state needs to be maintained except in the
+end-devices. 
+
+The behaviour of the session-timer-based construct is as follows:
+a caller initiates a call through a proxy server. The proxy server
+determines the maximum acceptable call length and indicates it using
+the session timer mechanisms. The timer is then propagated to
+the end-device using SIP. If it actually fires, the terminating
+gateway will try to refresh the session using a re-INVITE.
+The proxy server can then recalculate the available credit and, if it
+is too low, deny the re-INVITE. The end-device is then supposed
+to terminate the call using a BYE.
+
+We have never experimented with the session-timer-based solution.
+We do not know whether session timer negotiation troubles can
+occur. We do not know how widely session timer support is
+deployed in gateways. We do not know whether the standardization
+effort for session-timer will result in changes or when
+it will complete. Nevertheless, we think it is worth trying.
+Its appeal is that it leaves call termination, a call-stateful
+feature, down in the end-devices and does not place too big
+a burden on server developers and especially operators.
+
+WE THUS ENCOURAGE VOLUNTEERS TO EXPERIMENT WITH THIS OPTION.
+TAKE THE GATEWAY YOU HAVE, CHECK WHETHER IT SUPPORTS ST,
+ADD ST TO THE SER PROXY AND CHECK IF THINGS WORK.
+
+
+
+2. What Are B2BUA Limitations?
+------------------------------
+A B2BUA has all the drawbacks of a centralized solution. Although
+B2BUAs are applicable in the prepaid scenarios, one should not
+forget the price.
+
+a) it is a single point of failure. When additional signaling
+   occurs in the middle of a conversation and the B2BUA
+   is down, signaling will fail. (This doesn't happen if signaling
+   runs only between end-points.) Call persistency must be
+   implemented; signaling will otherwise fail on server
+   reboot.
+b) scalability issues: a B2BUA needs to keep state for two
+   calls for the whole duration of a conversation. That might
+   be an issue with too heavy traffic. Transaction state
+   takes 3k per transaction and lasts seconds. Call state
+   consumes at least twice as much and lasts minutes.
+c) e2e security does not work -- implementations aiming for
+   high security will want to encrypt and sign
+   SIP message bodies. A B2BUA breaks the e2e security if
+   it needs to change the body.
+d) economic aspects: it is simply yet another piece of
+   software you need to purchase or develop
+
+Much of this discussion has taken place on the IETF SIP
+and SIPPING mailing lists. A few messages from these
+discussions are linked from
+   http://www.iptel.org/info/trends/#b2bua
+
+3. How to Implement a B2BUA Using ser
+-------------------------------------
+
+
+
+At 10:00 AM 1/6/2003, chang hui wrote:
+>Jiri,
+>
+>Thanks for your explanation, and let me know the architecture drawback of the B2BUA.
+
+
+I've already done so in my previous email. If something was not clear
+enough, let me know.
+
+
+>Since we have no way to choose other means to implement pre-paid, we have to go along with B2BUA in a short term.
+>Could you give me any advise how to implement B2BUA based on SER and estimate the work we should do?
+>Could you give me a performance estimate?
+
+
+A hand-waving guesstimate is that performance degrades by 50%.
+(We currently achieve up to 3-5 kCPS on a PC -- fair capacity
+ to slice off from.)
+
+
+A B2BUA does a lot of things:
+- first, it keeps dialog state -- it remembers cseq, callid,
+  route set, etc. for the whole time of a call (i.e., it eats
+  memory). All this information is needed when you later wish
+  to initiate correct BYEs.
+- it translates UAC to UAS transactions and vice versa
+- you probably want to save the dialog state in some persistent
+  storage (mysql) -- signaling would not work on reboot otherwise
+
+
+That would take quite some development work. I think the amount
+of work can be somewhat lowered if normal (record-routed) proxy 
+processing is used, as opposed to a full B2BUA which terminates
+all UAS transactions and translates them to UAC transactions.
+You then still need to do the following:
+- keeping a dialog table (keyed by callid and local/remote tags)
+- updating the dialog table (new items on INVITE completion, remove 
+  dialogs on BYE, update dialog state, such as CSeq, on any other 
+  request).
+- starting a timer on beginning of a dialog that -- when expired,
+  subject to balance and charging plans --  sends BYEs to all call  
+  parties using dialog context.
+
+
+That could be implemented as a new ser module, which registers
+TM-module callbacks to be updated on transaction completion.
+One could also move the dialog maintenance out of ser to some
+shell scripts to make programming easier. That would however
+very likely degrade performance noticeably.
+
+
+Also note that these scenarios are based only on signaling -- there
+are no PSTN-prepaid-style announcements "you can call 5 minutes"
+and "your call will be cut off in 20 seconds". It is doable too,
+but it is probably more meaningful to start with the signaling
+part.
+
+
+-Jiri
+
+
+
+>Best Regards and Thanks.
+>
+>
+>Chang Hui
+>-----Original Message-----
+>From: Jiri Kuthan [mailto:[email protected]]
+>Sent: Saturday, January 04, 2003 8:29 PM
+>To: chang hui; [email protected]
+>Subject: RE: [Serusers] About B2BUA
+>
+>Hello,
+>
+>I see -- prepaid scenarios are indeed difficult without a B2BUA.
+>There has been a proposal few times to use session timer (a proxy
+>looks at balance and attaches a hint to SIP requests indicating
+>when a call should terminate), but the work has not been pursued.
+>
+>You may find a discussion of B2BUA architectural drawbacks on the
+>SIP mailing list, selected postings are at http://www.iptel.org/info/trends/#b2bua.
+>imho, the most compelling issue is that of robustness and scalability.
+>A b2bua needs to keep track of all current calls. A broken b2bua affects
+>signaling for all existing calls.
+>
+>Basically, a B2BUA is simply two UAs glued together. It accepts
+>transactions as a server, and initiates client transactions
+>based on them. It keeps dialog state (callid, cseqs, etc.) and
+>may initiate in-dialog transactions on its own (like the BYE
+>transaction in which you are interested).
+>
+>It is doable to implement a B2BUA on top of ser, but it would
+>cost quite some development effort. Particularly, it would take
+>dialog maintenance (better with persistency so that signaling
+>does not get broken on reboot). We can provide guidance to
+>volunteers willing to go through this exercise.
+>
+>-Jiri
+>
+>At 02:28 AM 1/4/2003, chang hui wrote:
+>>Jiri,
+>>
+>>Thanks for your prompt response.
+>>We want to implement a pre-paid system in which once subscriber's balance is depleted, the dialog could be torn in time. However other Proxy or other elements could not take part in the call, they could not send a BYE to caller directly. It's the why we consider B2BUA.
+>>We project to build a B2BUA to support voice/video/IM at first stage, and support other SIP based services as they emerged.
+>>However, I just noticed the definition of B2BUA in 2543-bis04 in several sentences,  there has no other analysis on performance, reliability, limitations and how to implement it. So, I hope to get help from the society.
+>>Thanks for your help again.
+>>
+>>Koalas
+>>
+>>-----Original Message-----
+>>From: Jiri Kuthan [mailto:[email protected]]
+>>Sent: Friday, January 03, 2003 11:06 PM
+>>To: chang hui; [email protected]
+>>Subject: Re: [Serusers] About B2BUA
+>>
+>>Hello,
+>>
+>>ser is not a B2BUA -- it can act as proxy, redirect, transactional UAS
+>>or registrar. These modes make a vast majority of network scenarios
+>>happy without a need to use a B2BUA. Which is good, because B2BUAs
+>>inherently have certain scalability, reliability and security limitations.
+>>
+>>Is there a particular reason why you would like to use a B2BUA?
+>>
+>>-Jiri
+>>
+>>At 08:00 AM 1/3/2003, chang hui wrote:
+>>>Hi All,
+>>>
+>>>I am newbie of this field, thanks everyone help me.
+>>>I am interesting in B2BUA, however, except some brief defination in 3261, I could not find any further defination or how to implement about B2BUA, I noticed that SER could be implemented as a B2BUA, where can I find some implementation? or where can I get any description?
+>>>
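
Section 3 of tmemo-jiri-b2bua.txt above lists what a record-routing proxy would still have to do for prepaid cut-off: keep a dialog table keyed by callid and local/remote tags, update it on INVITE completion, BYE and other in-dialog requests, and run a timer that triggers BYEs when credit runs out. The Python sketch below shows only that bookkeeping. It is not ser code: the names `DialogTable`, `Dialog` and `max_call_seconds` are hypothetical, the flat per-minute rate is an assumed charging plan, and actually sending the BYEs is left to whoever calls `expire()`.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

DialogKey = Tuple[str, str, str]  # (callid, local tag, remote tag)


def max_call_seconds(balance_cents: int, cents_per_minute: int) -> int:
    """Longest call the balance pays for, assuming a flat per-minute rate."""
    if cents_per_minute <= 0:
        raise ValueError("rate must be positive")
    return (balance_cents * 60) // cents_per_minute


@dataclass
class Dialog:
    cseq: int        # highest CSeq seen, needed later to build a correct BYE
    deadline: float  # absolute time at which the call must be cut off


class DialogTable:
    def __init__(self) -> None:
        self._dialogs: Dict[DialogKey, Dialog] = {}

    def on_invite_completed(self, key: DialogKey, cseq: int,
                            now: float, allowed_seconds: int) -> None:
        # New dialog: remember its state and when the credit runs out.
        self._dialogs[key] = Dialog(cseq=cseq, deadline=now + allowed_seconds)

    def on_in_dialog_request(self, key: DialogKey, method: str, cseq: int) -> None:
        if method == "BYE":
            self._dialogs.pop(key, None)   # call ended normally
        elif key in self._dialogs:
            dialog = self._dialogs[key]
            dialog.cseq = max(dialog.cseq, cseq)

    def expire(self, now: float) -> List[DialogKey]:
        """Remove and return dialogs past their deadline (BYEs are due)."""
        expired = [k for k, d in self._dialogs.items() if d.deadline <= now]
        for key in expired:
            del self._dialogs[key]
        return expired
```

Time is passed in explicitly so that the cut-off logic can be exercised without waiting; a real module would call `expire()` from a periodic timer and would also need the persistence the memo asks for.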

+ 395 - 0
doc/tmemo/tmemo-jiri-media.txt

@@ -0,0 +1,395 @@
+$Id$
+
+
+Draft Distributed Media Server Architecture
+===========================================
+
+Jiri Kuthan, iptel.org, January 2003
+
+
+Abstract
+--------
+
+We describe design considerations made when expanding the voicemail
+application into a more general media server. The objective of
+the media server is to bind voice to SIP applications with optional
+support of other tools (SIP SUB/NOT, mysql, TTS, etc.). It has
+to be configurable in such a way that it can act in different component
+roles: click-to-dial server, voicemail server, conferencing server,
+text-to-speech announcement server, etc.
+
+TOC
+---
+
+Section 1, Scenarios and Component Models, explains background
+assumptions on how services can be composed using the Rosenberg-advocated
+model. This section is essential to understanding how a media
+server can be plugged into a SIP network consisting of multiple
+components, each delivering a part of a complex service. The section
+also suggests a decentralized architectural improvement for connecting
+SIP components without a need for a B2BUA, a technology we consider
+suboptimal.  (This network architecture puts only very few additional
+requirements on the media server.)
+
+
+Section 2, Media Server Requirements, explains basic requirements
+a media server needs to fulfill to do a good job in the component
+architecture. Design ideas for the server's key part, a programming
+script, are explained in section 3.
+
+Related work, references and example scripts are attached in
+appendices.
+
+
+
+1) Targeted Scenarios and Component Model
+--------------------------------------
+Many application scenarios can provide a pleasant experience to users 
+when users are played explanatory messages or users' voice feedback 
+can affect service logic. That is what media servers are basically
+good for. The whole service logic may be complex and composed of multiple 
+stages (initial announcement, PIN verification, text-to-speech) which
+together form a longer conversation. The individual stages may be
+implemented as parts of a single media server or distributed across
+specialized (or specially configured instances of the same) media servers.
+
+Examples of such multi-stage conversations are voicemail, conferencing, 
+click-to-dial, and prepaid calls. Some of these scenarios have been 
+addressed in J. Rosenberg's dissertation and an almost identical Internet
+Draft co-authored by P. Mataga [components]. (See also [featureinteraction].)
+They proposed a component model, in which a B2BUA faces a caller on its
+UAS part, and connects to different SIP devices on its UAC part. This
+B2BUA, the so-called call controller, acts as a glue: it connects all possible
+SIP-enabled application components together. It maintains a "service
+state machine" which defines how to link components with each other
+as a session proceeds. It uses HTTP as a complementary protocol for 
+the components to report on their progress to the controller. For example, 
+the controller may first connect on caller's behalf to a "pre-paid prompt 
+component", which queries user's PIN and reports it to the controller. 
+On success, the controller can then hand-off the call to a PSTN gateway.
+
+This architecture is extremely good in that it introduces distributed
+components. Decomposition, an important design principle, is performed
+in a fair, peer-2-peer manner that allows linking SIP devices in
+a very flexible way.
+
+The biggest shortcoming of this architecture is imho its central piece,
+the controller. It is simply too central. A B2BUA design inherently raises
+many concerns: security, scalability, and reliability. B2BUA solutions
+proposed in the 3pcc draft [3pcc] by Rosenberg have several signaling drawbacks
+too: tricky media matching (flow III), backwards compatibility
+(flow IV), etc. There is also the economic aspect: a B2BUA
+costs money or development effort.
+
+We believe it is beneficial to avoid such B2BUA constructs. The mechanism
+we are advocating is distributing the service state machine across
+participating components. With such a scheme, it is the current component
+that decides what to do next, i.e., when to proceed to which next component.
+A caller contacts an initial component (say a PIN prompting media server)
+identified by a URI, which is in fact an identifier of the initial service
+state. An initial conversation is carried out then ("give me your PIN:
+1-2-3-4"). The component collects the PIN and when finished, it passes
+over to the next component. There is a choice to verify the PIN in the
+first component and pass over the final authorization status ("no" or
+"yes" or "yes but no longer than a 5 minute call") or to pass the PIN
+and leave its authorization to the next component.
+
+This construct is more distributed: the controller permanently involved
+in the caller's conversation is gone. It is always the current component
+that decides what to do next. There are always only two parties in
+a relationship: the caller and the current component. The "middlebox"
+B2BUA is gone.
+
+Another benefit of this more e2e-oriented approach is a better way
+of dealing with caller's preferences. Caller preferences are about the
+ability to gain the user's consent to transitions in a conversation -- e.g.,
+is it acceptable for a caller to be transferred to a CIA server? With
+the REFER approach, all transition decisions are actually made
+by the client, which is good. Other solutions, in which a downstream
+entity decides on the caller's behalf, are imho too limiting. They
+require the caller to upload his preferences in a standardized
+format to the upstream client. As the preference space is almost
+infinitely big, standardizing caller's preferences does
+not seem too beneficial to us. There may always be some preferences
+which the preference format does not capture. Make it simple and
+allow the caller to decide on his own behalf. He is responsible, knows
+what he wants and possibly does not trust the upstream client
+to interpret his preferences as desired.
+
+Mechanically, the transition to the next component can be easily
+achieved using REFER [refer]. When the current component completes, it hints
+the caller to proceed to the next one using REFER. The URI in Refer-To
+represents the next component (a PSTN proxy) as well as some
+service attributes ("pin ok, 5 minutes permitted") with which
+the component can begin. When, as in this case, the URI carries
+security-sensitive information, the information may be encrypted
+or a message integrity check may be attached. Note that this mechanism
+eliminates the need for the "HTTP reporting hack" in jdr's architecture.
+Session status is reported in SIP URIs. Cooperating components just
+need to agree on a scheme for URI usage. That should be easy for SIP
+servers as URI processing is a primary SIP ability.
+
+A simple application of this more distributed approach is a REFER-based
+"click-to-dial" service. In this scenario, a media component gets somehow
+instructed to initiate a call. It first calls the first party, optionally
+plays a short announcement ("you will be transferred now") and then transfers
+this initial call to the other call party. It then completely disappears
+from the subsequent conversation.
+
+The "pre-paid verification component" referred to in this section is another 
+example use of this model. It establishes a call with the caller, looks at
+the desired destination, processes the PIN in the media stream, and makes
+a decision to hand over to a gateway. It then disappears from the conversation.
+
+Note that the application call-control framework [ccframework] by Mahy et al. 
+explicitly mentions a more peer-2-peer oriented approach based on REFER as
+a good alternative to a centralized B2BUA approach. 
+
+
+
+2) Media Server Requirements: Flexibility and Extensibility
+-----------------------------------------------------------
+
+In all such application scenarios, a media component has a central
+role. It plays announcements, records messages, and interacts with
+the caller via signaling too: it can terminate or transfer a call.
+
+There are two major requirements on its design to make it useful
+for applications as mentioned above: it needs to be flexible 
+and extensible.
+
+Flexibility is desired to be able to configure the media server
+for its particular purpose without having to rewrite it each time. 
+It should be possible to configure whether on receipt of a 
+specific URI, the server plays or records a message. It should 
+be possible to dictate maximum call length and define what happens 
+when the length timer really strikes:  should the call be transferred 
+to another component (and if so, to which) or simply bye-d? Etc.
+
+We suggest that, as in SER, this flexibility be achieved
+by a scripting language (see below).
+
+The other requirement is extensibility. The media server scripts
+should be able to leverage other available tools. A particular
+example is coupling of script logic with MySQL databases --
+a feature that made PHP an ultimate success. In the context of the
+previous prepaid examples, it can be used to verify the user's PIN and
+maximum possible call length. Text-to-speech software such as
+festival [festival], AT&T's Natural Voices [nv] or CMU
+speech software  [cmuspeech] (!!!) including Sphinx, festvox,
+openvxi are examples of other pieces of work worth integrating
+with.
+
+3) On Scripting Language
+---------------------
+
+scope)
+
+The scripting language should be able to define call processing:
+establish, transfer, terminate a call, provide media processing
+and use external libraries (php, tts, etc.) in an extensible manner.
+It should stay open to integration with Internet services and
+allow things like HTTP queries or SIP instant messaging.
+
+call/transaction abstractions)
+
+The language should hide protocol details well to make programming
+easy. While access to lower-level features should not be precluded, 
+abstraction and simplicity are the key for application programming. 
+
+The primary living space of the media server programming language
+should be calls. Scripts should be able to deal with calls:
+initiate, terminate and transfer them. ([ccframework] coins
+"replace", "join", "fork").
+
+An important lower-level escape hatch should be the ability to initiate
+in-call (in-dialog) transactions. That is what allows the server
+to go beyond simple VoIP/media services. An example of use of
+such an ability would be sending notifications on some events
+(like when a new party joins a multi-party call conference)
+or subscribing to some call-related events:
+  ret=$call.new_transaction("INFO",
+     "headerfield: value\nhf2: ".$some_var."\n", "two USD");
+
+
+events)
+
+All of us have agreed that an event-oriented approach is a good
+abstraction. The event system should be very universal and
+accept events from a variety of sources in a unified manner.
+The sources include but are not limited to SIP messages, timers
+(so that for example a voicemail app can set the longest possible
+recording), external events from local apps (perhaps via FIFO),
+media events (DTMF), SIP notifications.
+
+There was also a proposal to introduce the notion of SUB/NOT and presence
+to the language. Examples of use are "initiate a conference call when all
+invited users are on-line", "repeat a call when called party is
+no longer busy" [dialogpackage], "query participant list in a multi-party
+conversation", etc. We haven't discussed yet whether, and if so
+how, such scenarios should be reflected in the language.
+
+requirement summary)
+
+So far, we have identified the following requirements:
+    - programming efficiency (easy and intuitive to use)
+    - parallelism (multiple scripts processed at the same time,
+      multiple calls referred to from a single script)
+    - variables (referring to multiple calls)
+    - event processing
+    - ability to change script without rebooting the server
+    - extensibility (i.e., the ability of the environment to link 
+      external binary libs and refer to them from scripts)
+
+Some design options mentioned so far (nice but not required):
+    - have some casting from input to variables (e.g., $request.header.callid)
+    - use OO -- there are many people for whom OO is easier
+    - exceptions to group error processing
+
+main-loop language)
+
+We have not made any determination yet on whether to reuse an
+existing scripting language (and bind SIP code or any other code
+to it from C/C++ libraries) or design our own from scratch.
+
+Proponents of language reuse (Python may be a reasonable option)
+are primarily concerned about too much unnecessary development 
+and debugging effort for both the basic language and especially 
+for its extensions.
+
+Opponents were concerned about difficulties with integration of
+the scripting languages with code libraries. Other cons are
+bigger image size and dependency on third-party software.
+However, risks of bugs and inability to tweak things are rather
+low with well-established open-source software like Python.
+Possibly, the syntax of our own language might better capture
+the semantics of the media server.
+
+As said, no determination has been made yet. The author of this
+memo is a little bit uncomfortable with the current amount of
+development work put on the ser team and hopes that use of an
+off-the-shelf language would save work cycles. (Hopefully,
+this hope will not be dashed by a tremendous effort spent
+on integration with supporting libraries.)
+
+
+see more )
+
+The appendices include pseudo-examples of scripts written in such
+languages. (An XML-based language was discussed too, but its
+proponent gave up on it since it was really big and difficult
+to read.)
+
+
+A) Related Work
+------------
+There has been a whole bunch of related work. Traditional IVRs
+were programmable decades ago. Related technologies include 
+[kpml], [mscml]*, [vxml], Cisco's use of TCL [ciscotcl].
+[bayonne] has some too.  snom uses an xml-based language,
+there is a voicemail system based on JavaScript and NIST SIP stack.
+
+* one of the differences between kpml and mscml is that kpml uses HTTP
+  for reporting (similarly to [components]), whereas MSCML uses SIP
+
+
+B) References
+----------
+[3pcc] http://www.iptel.org/ietf/callprocessing/3pcc/#draft-ietf-sipping-3pcc
+[bayonne] http://www.gnu.org/software/bayonne
+[ciscotcl] http://www.cisco.com/univercd/cc/td/doc/product/access/acs_serv/vapp_dev/tclivrv2/chapter1.htm
+[cmuspeech] http://www.speech.cs.cmu.edu/speech/
+[components] http://www.iptel.org/ietf/callprocessing/apps/#draft-rosenberg-sip-app-components-01
+[ccframework] http://www.iptel.org/ietf/callprocessing/#draft-ietf-sipping-cc-framework
+[dialogpackage] http://www.iptel.org/ietf/callprocessing/#draft-ietf-sipping-dialog-package
+[featureinteraction] http://www.iptel.org/ietf/callprocessing/apps/#draft-rosenberg-sipping-app-interaction-framework
+[festival] http://www.cstr.ed.ac.uk/projects/festival
+[mscml] http://www.iptel.org/ietf/callprocessing/apps/#draft-vandyke-mscml
+[kpml] http://www.iptel.org/ietf/callprocessing/apps/#draft-burger-sipping-kpml
+[nv] http://www.naturalvoices.att.com/
+[refer] http://www.iptel.org/ietf/callprocessing/refer/#draft-ietf-sip-refer
+        (recently approved by IESG for publication as RFC)
+[vxml] http://www.iptel.org/ietf/callprocessing/apps/#draft-rosenberg-sip-vxml-00
+
+
+C) Appendix: pseudo-scripting language
+------------------------------------
+
+/* voicemail */
+event{new_call}(call $c) {
+   $c.play("welcome"); /* play blocking */
+   new_timer(too_long, 200 sec, $c, terminate_call);
+   $c.record("/var/spool/voicemail/"+$c.callee); /* record non blocking */
+}
+event{eo_call}(call $c) {
+   // do nothing; by default, everything that has been started is closed
+}
+event{too_long}(call $c) {
+   $c.terminate();
+}
+
+/* 3pcc a la call transfer */
+event{click_to_dial} (uri $to, uri $from) {
+    $c=new_call("sip:[email protected]" /*our daemon invites caller */, $from /* caller */);
+    $c.play("you will be transferred now");
+    $c.refer($to); /* refer creates an event ... NOTIFY */
+}
+event{notify}(call $c) {
+    /* great, caller has established conversation with the other party --
+       we can hang-up now */
+    $c.terminate();
+}
+
+
+
+D) Appendix: use of python
+-----------------------
+
+
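+# Pseudo-code: SIPApplication, placeCall, tts and log stand for a
+# hypothetical Python binding of the media server.
+import os
+
+
+class App(SIPApplication):
+    def doInvite(self, req):
+        trans = req.transaction()
+        dlg = req.dialog()
+        app = dlg.application()
+
+        if req.uri().domain() == "voicemail.org":
+            try:
+                media = req.sdp.negotiate()
+                trans.reply(200)
+            except Exception:
+                trans.reply(500)
+                return  # no media session, nothing to play or record
+
+            # play the user's announcement, or the default one
+            file = "/home/" + req.uri().username() + "/ann.au"
+            if not os.path.exists(file):
+                file = "/ann.au"
+            media < file  # pseudo-syntax: play the file into the media stream
+
+            # record the caller's message, at most 200 seconds
+            file = "/home/" + req.uri().username() + "/msg.au"
+            media.maxlength(200) > file  # pseudo-syntax: record to the file
+
+    def doBye(self, req):
+        trans = req.transaction()
+        trans.reply(200)
+        req.dialog().media().stop()
+
+    def doHTTP(self, req):
+        # click-to-dial: call the first party, then refer it to the second
+        try:
+            dlg = placeCall(req.uri1)
+            dlg.media() < tts("just a moment")
+            dlg.refer(req.referto)
+            dlg.application().click = True
+        except Exception:
+            log("error")
+
+    def doNotify(self, req):
+        dlg = req.dialog()
+        if dlg.application().click:
+            req.transaction().reply(200)
+            dlg.bye()
+        else:
+            req.transaction().reply(...)
+
+    def doTimeout(self, app):
+        dlg = app.dialog("caller")
+        dlg.bye()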
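
Section 1 of tmemo-jiri-media.txt above proposes passing session status to the next component inside the Refer-To URI and protecting it with a message integrity check. A minimal Python sketch of one way to do that follows. The memo does not prescribe any of the details: the parameter names (`pin`, `maxdur`, `mic`), the use of HMAC-SHA256 with a secret shared between cooperating components, and the function names are all assumptions made for illustration, and the base URI is assumed to carry no parameters of its own.

```python
import hashlib
import hmac
from typing import Dict, Optional
from urllib.parse import quote, unquote


def _canonical(attrs: Dict[str, str]) -> str:
    # Sort the attributes so both components sign the same string.
    return ";".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(attrs.items())
    )


def _mic(payload: str, secret: bytes) -> str:
    return hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()


def build_refer_to(base_uri: str, attrs: Dict[str, str], secret: bytes) -> str:
    """Append service attributes and an integrity check as URI parameters."""
    payload = _canonical(attrs)
    parts = [base_uri] + ([payload] if payload else [])
    parts.append("mic=" + _mic(payload, secret))
    return ";".join(parts)


def verify_refer_to(uri: str, secret: bytes) -> Optional[Dict[str, str]]:
    """Return the attributes if the check matches, otherwise None."""
    _base, *params = uri.split(";")
    attrs: Dict[str, str] = {}
    for param in params:
        name, _, value = param.partition("=")
        attrs[unquote(name)] = unquote(value)
    received = attrs.pop("mic", None)
    if received is None:
        return None
    expected = _mic(_canonical(attrs), secret)
    if hmac.compare_digest(received.encode(), expected.encode()):
        return attrs
    return None
```

With this scheme the PIN-prompting component would call `build_refer_to` when composing the REFER, and the PSTN proxy would call `verify_refer_to` on the request URI of the resulting INVITE; a caller who edits the permitted duration invalidates the check. Note that this only detects tampering and does not hide the attributes, which the memo's encryption alternative would.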

+ 274 - 0
doc/tmemo/tmemo-jiri-vmail.txt

@@ -0,0 +1,274 @@
+$Id$
+
+
+Draft Voicemail Architecture
+============================
+
+Jiri Kuthan, iptel.org, January 2003
+
+Abstract
+--------
+
+We describe design decisions made when adding media
+support to iptel.org's SIP server suite. We discuss
+how to introduce a voicemail component most effectively,
+i.e., without requiring the voicemail programmer to get too
+deeply involved in SER. We also mention some design choices which
+can in general be made to couple external applications
+with SER.
+
+TOC
+---
+
+We first discuss interfacing methods used between SIP
+server/stack and applications in section 1, interfacing.
+We explain why we chose FIFO for the purpose.
+
+Section 2, IPC, gives details on the use of FIFO. Call flow
+examples and further use of FIFO are detailed in Section 3.
+
+Possible extensions of the FIFO interface are mentioned
+in section 4.
+
+We show how the IPC/FIFO mechanisms compare to CGI-BIN,
+which is architecturally close, in Section 5.
+
+1) Interfacing
+--------------
+
+A primary design objective is to hide SIP/SER internals from
+application builders. The SER code is not easy: it includes
+a lot of shmem access along with its synchronization, and quite
+dynamic memory use and management. Data structures are rich
+and dynamic. That makes the life of an application programmer
+quite difficult and is likely to result in a higher bug rate.
+Thus, it is desirable to decouple the application from the stack.
+
+We have considered two approaches: API-based and FIFO-based.
+The API-based approach requires a clean encapsulation of the parser,
+memory management and other frequently used code in a library.
+The library should hide as many details as possible from
+the application developer.
+
+While librarization of SER is a very desirable objective,
+it is a time-consuming task and we do not want it to become
+a road-block for application creation. That's the primary
+reason why we are going with FIFO now.
+
+There were technical arguments related to FIFO use in this
+context too. Some (myself) argued that FIFO provides
+the cleanest separation of applications from ser. It is
+language-independent, allowing the use of effective scripting
+languages and whatever else an app programmer is familiar with.
+It is in no way tied to ser's architecture and the burden of
+its parallel processing, synchronization, data structures
+and memory management.
+
+Counter-arguments (by almost everyone else) against FIFO included
+concerns that SER will become too bloated by exporting too
+much of its functionality through FIFO. It is certainly
+true that a technology may become a victim of its own
+success if it grows too big. SIP itself is unfortunately
+becoming an example of such a technology.
+
+The demarcation line we agreed to draw is dialog maintenance,
+which shall stay out of SER, whereas transaction-related
+processing will stay in SER.
+
+
+2) IPC
+------
+
+1) the voicemail server will not be started via fork/exec
+   as that is too expensive. Instead, it will be
+   multi-threaded and await INVITEs via its FIFO server.
+   SER will then dump incoming INVITE requests to the
+   voicemail's FIFO server (non-blocking). A drawback
+   is that the FIFO server will not be able to inherit
+   pre-parsed header fields in environment variables.
+
+2) subsequent requests, such as BYE, will take the
+   same FIFO path
+
+3) the external application will communicate with SER
+   using FIFO. For the purpose of replying to original
+   INVITEs, there will be a t_fifo_reply command.
+   The command will identify the transaction to be
+   replied to using the pair hash:label. It will be further
+   parametrized by the first reply line, optional header fields
+   and an optional body. (The pair hash:label will have to be
+   communicated via the method described in 1.)
+
+4) to-tags will be generated in the external app.
+   That's a change from previous suggestions. It's
+   a consequence of moving process/thread control
+   from SER to the app. In general, to-tags identify
+   a call and thereby the process/thread associated with
+   it. So the generation of to-tags should be owned by
+   the piece responsible for spawning new processes/threads
+   -- this is the place which will have to dispatch
+   subsequent requests to previously spawned processes.
+
+5) BYEs from voicemail (on timeout) will be sent using
+   fifo t_uac. fifo t_uac will have to be changed to
+   allow parametrization of Call-ID/CSeq (they are
+   currently ephemeral only). Call-IDs and CSeqs known from
+   previous requests will be passed to SER via FIFO as t_uac
+   parameters.
+
+6) As for CANCEL: the voicemail app doesn't care about it.
+   The app is automated and responds immediately, so CANCEL
+   is not relevant to it. It is the responsibility of the
+   transaction machine to take care of CANCELs. If a CANCEL
+   arrives while the transaction is still alive, it will not
+   affect the call state; otherwise it will be answered with 481.
+   See section 9.2 of RFC 3261 for details.
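+
+The to-tag ownership described in item 4 can be sketched as
+follows. This is a minimal illustration in Python, not part of
+SER: the DialogTable class, its method names, the stored fields
+and the sample Call-ID are all hypothetical.
+

```python
import secrets


class DialogTable:
    """Maps to-tags to per-call state inside the external application."""

    def __init__(self):
        self._calls = {}

    def new_call(self, call_id):
        # The application, not SER, creates the to-tag for a new dialog.
        to_tag = secrets.token_hex(8)
        self._calls[to_tag] = {"call_id": call_id, "state": "recording"}
        return to_tag

    def dispatch(self, to_tag):
        # Subsequent requests (e.g. BYE) are routed by their to-tag;
        # None means the call is unknown to this application.
        return self._calls.get(to_tag)

    def end_call(self, to_tag):
        # Forget the call; returns its state, or None if it was unknown.
        return self._calls.pop(to_tag, None)


table = DialogTable()
tag = table.new_call("abc@host")  # hypothetical Call-ID
```

+
+Because the same component creates the tag and keeps the table,
+it can hand any later request to the thread that owns the call.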
+
+3) Call Flows and FIFO Use
+---------------------------
+
+a) call setup
+
+---> ... SIP
+===> ... FIFO
+
+UAC               SER            VM
+         INVITE
+         ----->
+         100                          ; 100 generated automatically
+         <---                         ; by t_fifo
+                     t_fifo(INVITE)   ; if request acceptable, VM
+                     =========>       ; stores dialog state indexed by
+                                      ; newly created, unique to-tag and replies
+                     t_reply(200)
+                     <========
+                     200 ok           ; (FIFO/200 means transaction found and
+                     ========>        ; reply accepted for delivery)
+         200
+         <---
+
+b) voicemail terminates call on timer
+
+UAC              SER             VM
+                     t_uac(BYE)      ; VM generates BYE using dialog context
+                     <=========      ; created and stored on receipt of INVITE
+                                     ; (see section 8 of RFC 3261; dealing
+                                     ; with RR is particularly tricky)
+        BYE
+        <---
+        200           200
+        ---->        ==========>     ; uac completed -- FIFO returns
+                     
+
+c) caller terminates
+
+UAC              SER              VM
+
+      BYE            t_fifo(BYE)
+     ------>         ==========>     ; VM attempts to look-up the call; on look-up 
+                                     ; failure or if CSeq low, it initiates t_reply 
+                     t_reply(200)    ; with a negative code; otherwise, it completes
+                     <==========     ; recording and confirms the BYE with a 200
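+
+The decision the VM takes in flow c) can be sketched as follows.
+This is an illustration under assumptions: the handle_bye function
+and the layout of the calls table are hypothetical, and the memo
+only says "a negative code"; 481 and 500 are chosen here following
+section 12.2.2 of RFC 3261.
+

```python
def handle_bye(calls, to_tag, cseq):
    """Return (code, phrase) for a BYE received via t_fifo.

    calls maps to-tag -> {"last_cseq": int}; this layout is hypothetical.
    """
    call = calls.get(to_tag)
    if call is None:
        # Look-up failure: no such call is known to the VM.
        return 481, "Call/Transaction Does Not Exist"
    if cseq < call["last_cseq"]:
        # CSeq lower than the last one seen: out-of-order request.
        return 500, "Server Internal Error"
    # Known call with a valid CSeq: finish recording, confirm the BYE.
    del calls[to_tag]
    return 200, "OK"


calls = {"tag-1": {"last_cseq": 5}}
unknown = handle_bye(calls, "tag-x", 6)
stale = handle_bye(calls, "tag-1", 4)
accepted = handle_bye(calls, "tag-1", 6)
```
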
+
+--
+use of FIFO:
+
+  t_fifo is a tm action -- it creates a new transaction (t_newtran),
+  sends a provisional 100 back (interaction with the media component
+  may take long) and dumps the request to a file (presumably the media
+  server's FIFO):
+     t_fifo("/tmp/media_fifo", "some parameters");
+  The following items are dumped:
+  - the command name, t_fifo (the media component may receive other
+    kinds of requests too)
+  - parameters
+  - to_tag (optimization for a quick dialog look-up)
+  - transaction identification: hash and label (used to refer
+    to the transaction when replying)
+  - the received request
+  Finally, t_fifo sets a timer (otherwise, the application could fail
+  to reply and the transaction would never be released).
+
+  t_reply is a FIFO command, which is part of tm too -- it allows external
+  apps to reply to a pending transaction. It is parametrized as follows:
+  - to_tag (to be used if there is no tag in the original request; important
+            for looking up the dialog for future requests)
+  - transaction identifier: hash, label
+  - code
+  - phrase
+  - optional header fields and body
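+
+  The t_reply parameters listed above can be grouped as in the
+  following sketch. It is an illustration only: the TReply class,
+  its field names and the one-item-per-line layout are hypothetical
+  and do not describe the actual FIFO wire format.
+

```python
from dataclasses import dataclass, field


@dataclass
class TReply:
    """Container for the t_reply parameters (hypothetical names)."""
    to_tag: str
    trans_hash: int
    trans_label: int
    code: int
    phrase: str
    headers: list = field(default_factory=list)
    body: str = ""

    def lines(self):
        # Hypothetical layout: one parameter per line, in the order
        # in which the memo lists them.
        if not 100 <= self.code <= 699:
            raise ValueError("not a SIP status code: %d" % self.code)
        out = [
            "t_reply",
            self.to_tag,
            "%d:%d" % (self.trans_hash, self.trans_label),
            str(self.code),
            self.phrase,
        ]
        out.extend(self.headers)
        if self.body:
            out.append(self.body)
        return out


reply = TReply("a1b2c3", 12, 34, 200, "OK",
               headers=["Contact: <sip:vm@example.com>"])
command_lines = reply.lines()
```
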
+  
+4) Possible Extensions of the FIFO interface
+---------------------------------------------
+
+All these extensions are meant to help couple external
+applications with ser.
+
+A reasonable alternative for some other applications would be to
+use exec instead of FIFO for the SER->APP path. That would
+have the benefit of getting header fields conveniently
+parsed into environment variables, the same way the exec module
+does it (since release 0.8.11). The application would then not have
+to parse header fields. That, however, only makes sense if the
+executed apps are small -- forking is expensive, and it gets much
+worse if the started application is big and starts slowly.
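+
+An exec-ed application could pick up pre-parsed header fields as
+in the following sketch. The SIP_HF_ variable prefix and the
+header_fields helper are assumptions made for illustration; the
+exec module documentation is the authority on the variable names.
+

```python
import os


def header_fields(env=None, prefix="SIP_HF_"):
    """Collect header fields passed in environment variables.

    The prefix is an assumption; adjust it to what the server exports.
    """
    env = os.environ if env is None else env
    return {
        name[len(prefix):].lower(): value
        for name, value in env.items()
        if name.startswith(prefix)
    }


# Usage with an explicit mapping instead of the real environment:
sample = {"SIP_HF_FROM": "sip:alice@example.com", "PATH": "/bin"}
fields = header_fields(sample)
```
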
+
+Some other applications may wish to have other triggering
+points than just request receipt. For example, they may wish
+to be triggered on transaction completion (e.g., some
+accounting applications) or on receipt of a reply
+(with the possibility to initiate serial forking,
+for example). That is implementable: exec on transaction
+completion can be implemented similarly to how the exec
+module runs, and bound to the transaction machine via
+a TM callback. Care needs to be taken in the case
+of exec on reply receipt -- it is called from a callback
+installed within the reply processing mutex, which poses
+some implementation caveats: it has performance
+implications and deadlock potential. In particular,
+an exec-ed app bound to reply processing could cause
+a deadlock if it called FIFO/t_reply. Also, at least
+an environment variable describing reply status would
+have to be added, so that the script sees more than
+the original request.
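+
+The deadlock risk can be modelled with a few lines of Python.
+This is a simplified sketch, not SER code: a thread stands in for
+the exec-ed application, a threading.Lock for the reply processing
+mutex, and all function names are hypothetical. A timeout is used
+so that the example terminates; without it the program would hang.
+

```python
import threading

reply_lock = threading.Lock()  # stands in for the reply processing mutex


def fifo_t_reply(timeout=0.2):
    """Stand-in for FIFO/t_reply: it needs the reply lock to proceed."""
    acquired = reply_lock.acquire(timeout=timeout)
    if acquired:
        reply_lock.release()
    return acquired


def reply_callback():
    """Runs with the reply lock held, starts an 'app' and waits for it."""
    outcome = []
    with reply_lock:
        app = threading.Thread(
            target=lambda: outcome.append(fifo_t_reply()))
        app.start()
        app.join()  # the lock is still held while the app runs
    return outcome[0]


# False: the app could not get the lock held by its own caller.
app_got_lock = reply_callback()
```
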
+
+5) SIP CGI-BIN (RFC3050) comparison
+-----------------------------------
+SIP CGI-BIN is a nice mechanism for coupling external applications
+with SIP servers. It is textual, language-independent and separated
+from server processes. From this perspective, it is similar
+to SER's app-coupling mechanisms, which include execution of external
+applications as in the exec module, potentially integrated in TM's
+transaction management. The reason why SER is somewhat different is
+historical: we have been trying to address mainstream scenarios with
+compact solutions. Over time, they developed into bigger beasts
+comparable to SIP CGI-BIN today, but still different. We now try to
+explain how SER/FIFO/exec compares to SIP CGI-BIN.
+
+   Note that there are some applications in which ser's FIFO server
+   can be used whereas CGI-BIN is not applicable. In particular, the
+   FIFO server can be used if an application wants to initiate
+   transactions or dialogs. CGI-BIN is only invoked when the server
+   (through receipt of a message) wants to run applications.
+
+Also, knowledge of the gaps may be used to implement CGI-BIN for
+SER, if ever wanted.
+
+Similarities:
+- both CGI and t_exec (as suggested in #4) can start external apps
+  on request receipt; retransmissions and other transaction burden
+  are handled by the server
+- both CGI apps and t_exec apps can steer proxy server's 
+  transaction logic; CGI apps do so by returning instructions 
+  on stdout, t_exec apps can do so through FIFO server
+
+SER Deficiencies:
+- enabling applications to remove header fields (CGI permits that)
+  through FIFO is currently not possible -- there is no such FIFO
+  command; it should not be difficult to implement
+- request forwarding is not possible either, for the same reason --
+  no such FIFO command; easy to change, though
+- the application can be re-execed on receipt of a reply from
+  a reply_route like in CGI-BIN; however, there are no meaningful
+  FIFO actions that can be used; use of FIFO/t_reply can result
+  in a deadlock as reply_route is called from within a reply_lock,
+  which is also acquired by t_reply called from the FIFO server