Browse Source

tm: Number of fixes in code and documentation for serial forking.

This patch improves the serial forking related code and documentation.
The original code suffers from shortcommits, some of which were result
of the migration to the new core.

The function that decodes destination sets encoded in AVPs has been
improved, because the original version did not handle properly strings
with missing elements. In such case the original implementation was
likely to overwrite memory, because it did not check the return value
of strchr properly. The new implementation tries to handle this
situation. It continues parsing as long as it can, it only requires
that the request-uri string is present, all other fields are made
optional and their variables are properly initialized if their values
cannot be found in the AVP.

There was a bug in the implementation of fr_inv_timer_next modparam in
the original version, changes to the parameter value were ignored by
the serial forking code. This was reported by Andrei and is now fixed
by this commit. The parameter fr_inv_timer_next can now be configured
at runtime with the configuration framework. Its value is in
milliseconds and unlike fr_inv_timer, this timer cannot be configured
separately for individual branches.

Obsolete definition of INV_FR_TIME_OUT_FIRST has been removed. That
macro is not used anywhere in the code, thus it is not needed.

There were several places where LM_DBG printed a string and relied on
the string being zero terminated, this may or may not be true in the
future and this patch uses the macro STR_FMT where appropriate, which
is safer.

Function t_next_contacts now checks if the function decode_branch_info
really returned values for the dst_uri and the path vector. If not then
it calls reset_dst_uri and reset_path_vector. Previous version of the
code crashed sip-router, this is likely due to the merge and updates in
the sip-router core.

We now use t_set_fr in t_next_contacts for setting the fr_inv_timer
value to fr_inv_timer_next. This is much more efficient than creating
AVPs with new timer values. Also the new value of the timer is now
taken from a variable in the configuration framework, instead of just a
regular global variable configured through modparam. This way we can
adjust the value of the timer on the fly. Configuring it through
modparam is, of course, possible too.

The value of of fr_inv_timer_next is now in milliseconds, instead of
seconds. That's the only possibly incompatible change. However, this is
consistent with all other timers in tm module, it is more efficient and
it offers better granularity.

A missing call to destroy_avp has been added to t_next_contacts, in the
code which is executed when no transaction exists. There, the avp
should also be destroyed if all values have been exhausted and none of
them had Q_FLAG set. This is a corner case which should not happen
under normal circumstances, because that situation only happens if all
branches have the same q value. Such AVP would not have been created by
t_load_contacts and therefore t_next_contacts should not be called, but
this bug may be triggered if someone uses t_next_contacts in an
unexpected way and it is probably better to have it fixed.

Also the code which restores the value of fr_inv_timer at the end of
t_next_contacts did not work properly. This patch fixes that. It first
tries to retrieve a value configured with t_set_fr, but that is not
guaranteed to succeed. After that it also tries the timer AVP and
finally the configuration framework. The configuration framework always
yields a value, so we can always restore the timer value, but we may
fail to restore individual transaction timer values set by t_set_fr. If
that fails then the global value from the configuration framework is
used. This is documented as a shortcomming in the README and in the
code.

In addition to code changes this patch also expands documentation on
functions t_load_contacts and t_next_contacts, describing their
operation in more detail. Also the format of the contacts AVP is now
documented.

Finally, there is a whole new section in the README which describes
how serial/parallel forking can be achieved with t_load_contacts and
t_next_contacts and provides a number of examples.
Jan Janak 16 years ago
parent
commit
ed3902e9bd
8 changed files with 851 additions and 320 deletions
  1. 331 190
      modules/tm/README
  2. 3 4
      modules/tm/config.c
  3. 1 2
      modules/tm/config.h
  4. 105 24
      modules/tm/doc/functions.xml
  5. 61 22
      modules/tm/doc/params.xml
  6. 218 0
      modules/tm/doc/tm.xml
  7. 131 77
      modules/tm/t_serial.c
  8. 1 1
      modules/tm/tm.c

File diff suppressed because it is too large
+ 331 - 190
modules/tm/README


+ 3 - 4
modules/tm/config.c

@@ -50,6 +50,7 @@ struct cfg_group_tm	default_tm_cfg = {
 	1,	/* via1_matching */
 	1,	/* via1_matching */
 	FR_TIME_OUT,	/* fr_timeout */
 	FR_TIME_OUT,	/* fr_timeout */
 	INV_FR_TIME_OUT,	/* fr_inv_timeout */
 	INV_FR_TIME_OUT,	/* fr_inv_timeout */
+	INV_FR_TIME_OUT_NEXT, /* fr_inv_timeout_next */
 	WT_TIME_OUT,	/* wait_timeout */
 	WT_TIME_OUT,	/* wait_timeout */
 	DEL_TIME_OUT,	/* delete_timeout */
 	DEL_TIME_OUT,	/* delete_timeout */
 	RETR_T1,	/* rt_t1_timeout */
 	RETR_T1,	/* rt_t1_timeout */
@@ -92,8 +93,6 @@ struct cfg_group_tm	default_tm_cfg = {
 			 * for every method except BYE by default */
 			 * for every method except BYE by default */
 	1,	/* cancel_b_method used for e2e and 6xx cancels*/
 	1,	/* cancel_b_method used for e2e and 6xx cancels*/
 	1,	/* reparse_on_dns_failover */
 	1,	/* reparse_on_dns_failover */
-	INV_FR_TIME_OUT_NEXT, /* fr_inv_timeout_next -> for serial forking subseq.
-							 branches */
 	0 /* disable_6xx, by default off */
 	0 /* disable_6xx, by default off */
 };
 };
 
 
@@ -111,6 +110,8 @@ cfg_def_t	tm_cfg_def[] = {
 	{"fr_inv_timer",	CFG_VAR_INT | CFG_ATOMIC,	0, 0, timer_fixup, 0,
 	{"fr_inv_timer",	CFG_VAR_INT | CFG_ATOMIC,	0, 0, timer_fixup, 0,
 		"timer which hits if no final reply for an INVITE arrives "
 		"timer which hits if no final reply for an INVITE arrives "
 		"after a provisional message was received (in milliseconds)"},
 		"after a provisional message was received (in milliseconds)"},
+	{"fr_inv_timer_next",	CFG_VAR_INT,	0, 0, 0, 0,
+		"The value [ms] of fr_inv_timer for subsequent branches during serial forking."},
 	{"wt_timer",		CFG_VAR_INT | CFG_ATOMIC,	0, 0, timer_fixup, 0,
 	{"wt_timer",		CFG_VAR_INT | CFG_ATOMIC,	0, 0, timer_fixup, 0,
 		"time for which a transaction stays in memory to absorb "
 		"time for which a transaction stays in memory to absorb "
 		"delayed messages after it completed"},
 		"delayed messages after it completed"},
@@ -185,8 +186,6 @@ cfg_def_t	tm_cfg_def[] = {
 		"if set to 1, the SIP message after a DNS failover is "
 		"if set to 1, the SIP message after a DNS failover is "
 		"constructed from the outgoing message buffer of the failed "
 		"constructed from the outgoing message buffer of the failed "
 		"branch instead of from the received request"},
 		"branch instead of from the received request"},
-	{"fr_inv_timer_next",	CFG_VAR_INT,	0, 0, timer_fixup, 0,
-		"The value of fr_inv_timer for subsequent branches during serial forking"},
 	{"disable_6xx_block",	CFG_VAR_INT | CFG_ATOMIC,	0, 1, 0, 0,
 	{"disable_6xx_block",	CFG_VAR_INT | CFG_ATOMIC,	0, 1, 0, 0,
 		"if set to 1, 6xx is treated like a normal reply (breaks rfc)"},
 		"if set to 1, 6xx is treated like a normal reply (breaks rfc)"},
 	{0, 0, 0, 0, 0, 0}
 	{0, 0, 0, 0, 0, 0}

+ 1 - 2
modules/tm/config.h

@@ -54,7 +54,6 @@
 #define INV_FR_TIME_OUT   120000 /* ms */
 #define INV_FR_TIME_OUT   120000 /* ms */
 
 
 /*! \brief final response timers to be used for serial forwarding */
 /*! \brief final response timers to be used for serial forwarding */
-#define INV_FR_TIME_OUT_FIRST 90000 /* ms */
 #define INV_FR_TIME_OUT_NEXT  30000 /* ms */
 #define INV_FR_TIME_OUT_NEXT  30000 /* ms */
 
 
 /* WAIT timer ... tells how long state should persist in memory after
 /* WAIT timer ... tells how long state should persist in memory after
@@ -108,6 +107,7 @@ struct cfg_group_tm {
 	int	via1_matching;
 	int	via1_matching;
 	unsigned int	fr_timeout;
 	unsigned int	fr_timeout;
 	unsigned int	fr_inv_timeout;
 	unsigned int	fr_inv_timeout;
+	unsigned int    fr_inv_timeout_next;
 	unsigned int	wait_timeout;
 	unsigned int	wait_timeout;
 	unsigned int	delete_timeout;
 	unsigned int	delete_timeout;
 	unsigned int	rt_t1_timeout;
 	unsigned int	rt_t1_timeout;
@@ -134,7 +134,6 @@ struct cfg_group_tm {
 	unsigned int	tm_blst_methods_lookup;
 	unsigned int	tm_blst_methods_lookup;
 	unsigned int	cancel_b_flags;
 	unsigned int	cancel_b_flags;
 	int	reparse_on_dns_failover;
 	int	reparse_on_dns_failover;
-	unsigned int fr_inv_timeout_next;
 	int disable_6xx;
 	int disable_6xx;
 };
 };
 
 

+ 105 - 24
modules/tm/doc/functions.xml

@@ -1031,14 +1031,58 @@ failure_route[1] {
 		<function moreinfo="none">t_load_contacts()</function>
 		<function moreinfo="none">t_load_contacts()</function>
 		</title>
 		</title>
 		<para>
 		<para>
-                Loads contacts in destination set in increasing qvalue order as
-                values of contacts_avp. If all contacts in the destination set
-                have the same qvalue, t_load_contacts() does not do
-                anything thus 
-                minimizing performance impact of serial forking capability
-                when it is not needed. Returns 1 if loading of contacts
-                succeeded or there was nothing to do. Returns -1 on error (see
-                syslog).
+		  This is the first of the two functions that can be used to implement
+		  serial/parallel forking based on the q value of individual branches
+		  in a destination set.
+		</para>
+		<para>
+		  The function <function>t_load_contacts()</function> takes all
+		  branches from the current destination set and encodes them into the
+		  AVP whose name or ID is configured with the
+		  parameter <varname>contacts_avp</varname>. Note that you have to
+		  configure this parameter before you can use the function, the
+		  parameter is set to NULL by default, which disables the function.
+		</para>
+		<para>
+		  If the destination set contains only one branch (the Request-URI) or
+		  if all branches have the same q value then the function does nothing
+		  to minimize performance impact. In such case all branches should be
+		  tried in parallel and that is the default mode of operation of
+		  functions like <function>t_relay()</function>, so there is no need
+		  to create the AVP or sort the branches.
+		</para>
+		<para>
+		  If the current destination set contains more than one branch and not
+		  all branches have the same q value then the function sorts them
+		  according to the increasing value of the q parameter. The resulting
+		  sorted list of branches is then encoded into the AVP.
+		</para>
+		<para>
+		  The q parameter contains a value from a range of 0 to 1.0 and it
+		  expresses relative preferrence of the branch among all branches in
+		  the destination set. The higher the q value the more preferrence the
+		  user agent gave to the branch. Branches with higher q values will be
+		  tried first when serial forking takes place.
+		</para>
+		<para>
+		  After that the function clears all branches and you have to
+		  call <function>t_next_contacts</function> to retrieve them sorted
+		  according to their q value. Note that if you
+		  use <function>t_load_contacts</function> then you also have to
+		  use <function>t_next_contacts</function> before
+		  calling <function>t_relay</function>.
+		</para>
+		<para>
+		  The AVP created by the function may contain multiple values, with
+		  one encoded branch per value. The first value will contain the
+		  branch with the highest q value. Each value contains the
+		  Request-URI, the destination URI, the path vector, the outgoing
+		  socket description and branch flags. All these fields are delimited
+		  with the LF character.
+		</para>
+		<para>
+          The function returns 1 if loading of contacts succeeded or there was
+          nothing to do. Returns -1 on error (see syslog).
 		</para>
 		</para>
 		<para>
 		<para>
 		This function can be used from REQUEST_ROUTE.
 		This function can be used from REQUEST_ROUTE.
@@ -1061,24 +1105,61 @@ if (!t_load_contacts()) {
 		<function moreinfo="none">t_next_contacts()</function>
 		<function moreinfo="none">t_next_contacts()</function>
 		</title>
 		</title>
 		<para>
 		<para>
-                If transaction does not exist when t_next_contacts() is
-                called, replaces Request-URI with the
-                first contacts_avp value, adds the remaining contacts_avp values
-                with the same qvalue as branches, and destroys those AVPs. It
-                does nothing if there are no contacts_avp values. Returns 1 if
-                there were no errors and -1 if an error occurred (see syslog).
+		  The function <function>t_next_contacts</function> is the second of
+		  the two functions that can be used to implement serial/parallel
+		  forking based on the q value of individual branches in a destination
+		  set.
+		</para>
+		<para>
+		  This function takes the AVP created
+		  by <function>t_load_contacts</function> and extracts branches with
+		  highest q value from it into the destination set when called for the
+		  first time. When you call the function second time it extracts
+		  branches with lower q value, and so on until all branches have been
+		  extracted.	  
+		</para>
+		<para>
+          If no transaction exist when <function>t_next_contacts()</function>
+          is called, this usually happens when you call the function from a
+          request route block before you call <function>t_relay</function>, it
+          replaces the Request-URI with the first contacts_avp value, adds the
+          remaining contacts_avp values with the same q value as branches, and
+          destroys those AVPs. 
+		</para>
+        <para>
+          If transaction does exist when t_next_contacts() is called, it adds
+          the first <varname>contacts_avp</varname> value and all following
+          <varname>contacts_avp</varname> values with the same q value as new
+          branches to request and destroys those AVPs. 
+        </para>
+		<para>
+		  When you call the function repeatedly from a failure_route branch,
+		  it looks for more AVP values each time and adds branches that have
+		  same q value from the AVP as additional destination set branches. It
+		  always stops when it reaches a branch that has a lower q value. Used
+		  AVP values are always destroyed.,
+		</para>
+		<para>
+		  The function does nothing if there are
+		  no <varname>contact_avp</varname> values.
+		</para>
+		<para>
+          The function returns 1 if there were no errors and -1 if an error
+          occurred (see syslog). This function can be used from REQUEST_ROUTE and FAILURE_ROUTE.
+		</para>
+		<para>
+		  Note that if use use <function>t_load_contacts</function>
+		  and <function>t_next_contacts</function> functions then you should
+		  also set the value of <varname>restart_fr_on_each_reply</varname>
+		  parameter to 0. If you do not do that then it can happen that a
+		  broken user agent that retransmits 180 periodically will keep
+		  resetting the fr_inv_timer value and serial forking never happens.
 		</para>
 		</para>
-                <para>
-                If transaction does exist when t_next_contacts() is
-                called, adds the first contacts_avp value and all
-                following contacts_avp values with the 
-                same qvalue as new branches to request and destroys those AVPs.
-                Returns 1 if new branches were successfully added and -1 on
-                error (see syslog) or if there were no more contacts_avp
-                values.
-                </para>
 		<para>
 		<para>
-		This function can be used from REQUEST_ROUTE and FAILURE_ROUTE.
+		  Also make sure that you
+		  configured <varname>fr_inv_timer_next</varname> with lower value,
+		  especially if you expect to have many serially forked branches. See
+		  the documentation of that parameter for more details.
 		</para>
 		</para>
 		<example>
 		<example>
 		<title><function>t_next_contacts</function> usage</title>
 		<title><function>t_next_contacts</function> usage</title>

+ 61 - 22
modules/tm/doc/params.xml

@@ -744,38 +744,77 @@ onreply_route["stateless_replies"] {
     </section>
     </section>
 
 
 	<section>
 	<section>
-		<title><varname>fr_inv_timer_next</varname> (integer)</title>
-		<para>
-                Value of the Final Response timeout for INVITE
-                transactions to be used during serial forwarding:
-		</para>
-		<para>
-                Function t_next_contacts() sets fr_inv_timer to
-                fr_inv_timer_next value if, after t_next_contacts() is
-                called, there are still lower qvalue contacts available,
-                and to fr_inv_timer value if there are not.
-		</para>
-		<para>
+	  <title><varname>fr_inv_timer_next</varname> (integer)</title>
+	  <para>
+		This parameter can be used to configure an alternative value for the
+		fr_inv_timer timer. This alternative value is used in place
+		of <varname>fr_inv_timer</varname> when serial forking takes place. It
+		is used for all branches during serial forking except the last one.
+		The last branch (or a set of parallel branches) use the original value
+		from <varname>fr_inv_timer</varname> again.
+	  </para>
+	  <para>
+		The purpose of the timer is to allow an administrator to configure a
+		shorter version of <varname>fr_inv_timer</varname> that is used only
+		when serial forking takes place. Forwarding branches one after another
+		is much more time consuming, because every serial branch has to wait
+		for the result of the previous one. That can take up to the value
+		of <varname>fr_inv_timer</varname> and this timer is configured to two
+		minutes by default. Hence, if you have three serial branches then
+		completing the transaction can take six minutes with default timer values.
+	  </para>
+	  <para>
+		In practise, the transaction will be terminated sooner, because the
+		timer <varname>max_inv_lifetime</varname> hits after three minutes.
+		Thus, some of the serial branches might not be forwarded at all. And
+		this is exactly what <varname>fr_inv_timer_next</varname> is for. You
+		can configure the timer to a shorter value to ensure that all serial
+		branches are tried before the
+		timer <varname>max_inv_lifetime</varname> hits.
+	  </para>
+	  <para>
+		Note that if there is only one branch or if the current serial branch
+		is the last one (i.e. no more serial forking takes place after this
+		branch is finished) then <varname>fr_inv_timer_next</varname> is not
+		used, instead the branch uses the
+		longer <varname>fr_inv_timer</varname>.
+	  </para>
+	  <para>
+        Function <function>t_next_contacts()</function>
+        sets <varname>fr_inv_timer</varname>
+        to <varname>fr_inv_timer_next</varname> if serial forking takes place
+        and there is more than one serial branch.
+	  </para>
+	  <para>
+		The administrator can configure the value of this timer using the
+		configuration framework on the fly. But
+		unlike <varname>fr_inv_timer</varname>, it is not possible to
+		configure the value of this timer on per-transaction basis.
+	  </para>
+	  <para>
 		<emphasis>
 		<emphasis>
-			Default value is 30.
+		  The value of this timer is to be specified in milliseconds. The
+		  default value is 30000ms.
 		</emphasis>
 		</emphasis>
-		</para>
-		<example>
+	  </para>
+	  <example>
 		<title>Set <varname>fr_inv_timer_next</varname> parameter</title>
 		<title>Set <varname>fr_inv_timer_next</varname> parameter</title>
 		<programlisting format="linespecific">
 		<programlisting format="linespecific">
 ...
 ...
-modparam("tm", "fr_inv_timer_next", 10)
+modparam("tm", "fr_inv_timer_next", 10000)
 ...
 ...
-</programlisting>
-		</example>
+		</programlisting>
+	  </example>
 	</section>
 	</section>
-
+	
 	<section>
 	<section>
 		<title><varname>contacts_avp</varname> (string)</title>
 		<title><varname>contacts_avp</varname> (string)</title>
 		<para>
 		<para>
-                Internal AVP that t_load_contacts() function uses to store
-                contacts of the destination set and that
-                t_next_contacts() function uses to restore those contacts.
+		  This is the name or Id of an AVP
+                that <function>t_load_contacts()</function> function uses to
+                store contacts of the destination set and that
+                <function>t_next_contacts()</function> function uses to
+                restore those contacts.
 		</para>
 		</para>
 		<para>
 		<para>
 		<emphasis>
 		<emphasis>

+ 218 - 0
modules/tm/doc/tm.xml

@@ -104,6 +104,212 @@
 		it to see if what you are looking for is there.</para>
 		it to see if what you are looking for is there.</para>
 	</note>
 	</note>
     </section>
     </section>
+
+	<section id="tm.serial_forking">
+	  <title>Serial Forking Based on Q Value</title>
+	  <para>
+		A single SIP INVITE request may be forked to multiple destinations. We
+		call the set of all such destinations a destination set. Individual
+		elements within the destination sets are called branches. The script
+		writer can add URIs to the destination set from the configuration
+		file, or they can be loaded from the user location database, each
+		registered contact then becomes one branch in the destination set.
+	  </para>
+	  <para>
+		The default behavior of the tm module, if it encounters a SIP message
+		with multiple branches in the destination set, it to forward the SIP
+		message to all the branches in parallel. That means it sends the
+		message to all the branch destinations before it waits for replies
+		from any of them. This is the default behavior if you
+		call <function>t_relay()</function> and similar functions without
+		anything else.
+	  </para>
+	  <para>
+		Another approach of handling multiple branches in a destination set it
+		serial forking. When configured to do serial forking, the server takes
+		the first branch out of the destination set, forwards the message to
+		its destination and waits for a reply or timeout. Only after a reply
+		has been received or the timeout occurred, the server takes another
+		destination from the destination set and tries again, until it
+		receives a positive final reply or until all branches from the
+		destination set have been tried.
+	  </para>
+	  <para>
+		Yet another, more sophisticated, way of handling multiple branches is
+		combined serial/parallel forking, where individual branches within the
+		destination set are assigned priorities. The order in which individual
+		branches are tried is then determined by their relative priority
+		within the destination set. Branches can be tried sequentially in the
+		descending priority order and all branches that have the same priority
+		can be tried in parallel. Such combined serial/parallel forking can be
+		achieved in the tm module with the help of
+		functions <function>t_load_contacts()</function>
+		and <function>t_next_contacts()</function>.
+	  </para>
+	  <para>
+		Every branch in the destination set is assigned a priority number,
+		also known as the q value. The q value is a floating point number in a
+		range 0 to 1.0. The higher the q value number, the more priority is
+		the particular branch in the destination set is given. Branches with q
+		value 1.0 have maximum priority, such branches should be always tried
+		first in serial forking. Branches with q value 0 have the lowest
+		priority and they should by tried after all other branches with higher
+		priority in the destination set.
+	  </para>
+	  <para>
+		As an example, consider the following simple configuration file. When
+		the server receives an INVITE, it creates four branches for it with
+		usernames A through D and then forwards the request
+		using <function>t_relay()</function>:
+	  </para>
+	  <programlisting format="linespecific">
+route {
+  seturi("sip:[email protected]");
+  append_branch("sip:[email protected]");
+  append_branch("sip:[email protected]");
+  append_branch("sip:[email protected]");
+
+  t_relay();
+  break;
+}
+</programlisting>
+	  <para>
+		With this configuratin the server forwards the request to all four
+		branches at once, performing parallel forking described above. We did
+		not set the q value for individual branches in this example but we can
+		do that by slightly modifying the arguments given
+		to <function>append_branch()</function>:
+	  </para>
+	  <programlisting format="linespecific">
+route {
+  seturi("sip:[email protected]");
+  append_branch("sip:[email protected]", "0.5");
+  append_branch("sip:[email protected]", "0.5");
+  append_branch("sip:[email protected]", "1.0");
+
+  t_relay();
+  break;
+}	 
+</programlisting>
+	  <para>
+		Here we assigned q value 0.5 to branches B and C and q value 1.0 to
+		branch D. We did not specify any q value for branch A and in that case
+		it is assumed that its q value is the lowest from all branches within
+		the destination set. If you try to run this example again, you will
+		figure out that nothing changed, <function>t_relay()</function> still
+		forward the message to all branches in parallel.
+	  </para>
+	  <para>
+		We now want to implement the combined serial/parallel forking. Branch
+		D should be tried first, because its q value is 1.0. Branches B and C
+		should be tried in parallel, but only after D finishes. Branch A
+		should be tried after B and C finished, because its q value (the
+		default) is the lowest of all. To do that, we need to introduce two
+		new functions into our example and one tm module parameter:
+	  </para>
+	  <programlisting format="linespecific">
+modparam("tm", "contacts_avp", "tm_contacts");
+
+route {
+  seturi("sip:[email protected]");
+  append_branch("sip:[email protected]", "0.5");
+  append_branch("sip:[email protected]", "0.5");
+  append_branch("sip:[email protected]", "1.0");
+
+  t_load_contacts();
+
+  t_next_contacts();
+  t_relay();
+  break;
+}	 
+</programlisting>
+	  <para>
+		First of all, the tm module parameter is mandatory if the two new
+		functions are used. Function <function>t_load_contacts()</function>
+		takes all branches from the destination set, sorts them according to
+		their q values and stores them in the AVP configured in the modparam.
+		The function also clears the destination set, which means that it
+		removes all branches configured before
+		with <function>seturi()</function>
+		and <function>append_branch()</function>.
+	  </para>
+	  <para>
+		Function <function>t_next_contacts()</function> takes the AVP created
+		by the previous function and extract the branches with highest q
+		values from it. In our example it is branch D. That branch is then put
+		back into the destination set and when the script finally
+		reaches <function>t_relay()</function>, the destination set only
+		contains branch D and the request will be forwarded there.
+	  </para>
+	  <para>
+		We achieved the first step of serial forking, but this is not
+		sufficient. Now we also need to forward to other branches with lower
+		priority values when branch D finishes. To do that, we need to extend
+		the configuration file again and introduce a failure_route section:
+		</para>
+	  <programlisting format="linespecific">
+modparam("tm", "contacts_avp", "tm_contacts");
+
+route {
+  seturi("sip:[email protected]");
+  append_branch("sip:[email protected]", "0.5");
+  append_branch("sip:[email protected]", "0.5");
+  append_branch("sip:[email protected]", "1.0");
+
+  t_load_contacts();
+
+  t_next_contacts();
+  t_on_failure("serial");
+  t_relay();
+  break;
+}
+
+failure_route["serial"]
+{
+  if (!t_next_contacts()) {
+    exit;
+  }
+
+  t_on_failure("serial");
+  t_relay();
+}
+</programlisting>
+	  <para>
+		The failure_route section will be executed when branch D finishes. It
+		executes <function>t_next_contacts()</function> again and this time
+		the function retrieves branches B and C from the AVP and adds them to
+		the destination set. Here we need to check the return value of the
+		function, because a negative value indicates that there were no more
+		branches, in that case the failure_route should just terminate and
+		forward the response from branch D upstream.
+	  </para>
+	  <para>
+		If <function>t_next_contact()</function> returns a positive value then
+		we have more new branches to try and we need to setup the
+		failure_route again and call <function>t_relay()</function>. In our
+		example the request will now be forwarded to branches B and C in
+		paralell, because they were both added to the destination set
+		by <function>t_next_contacts()</function> at the same time.
+	  </para>
+	  <para>
+		When branches B and C finish, the failure_route block is executed
+		again, this time <function>t_next_contacts()</function> puts the final
+		branch A into the destination set and <function>t_relay()</function>
+		forwards the request there.
+	  </para>
+	  <para>
+		And that's the whole example, we achieved combined serial/parallel
+		forking based on the q value of individual branches. In real-world
+		configuration files the script writer would need to check the return
+		value of all functions and also configure some additional parameters,
+		such as <varname>fr_inv_timer_next</varname>
+		and <varname>restart_fr_on_each_reply</varname>. Also the destination
+		set would not be configured directly in the configuration file, but
+		can be retrieved from the user location database, for example. In that
+		case registered contacts will be stored in the destination set as
+		branches and their q values (provided by UAs) will be used.
+	  </para>
+	</section>
     
     
     <section id="tm.known_issues">
     <section id="tm.known_issues">
 	<title>Known Issues</title>
 	<title>Known Issues</title>
@@ -135,6 +341,18 @@
 		    longer so much as there is a new replication module).
 		    longer so much as there is a new replication module).
 		</para>
 		</para>
 	    </listitem>
 	    </listitem>
+	    <listitem>
+		<para>
+		    Function <function>t_next_contacts</function> should restore the
+		    value of timer <varname>fr_inv_timer</varname> when there are no
+		    more branches to be processed serially. The function can restore
+		    the timer properly if it has been configured through an AVP or
+		    with the config framework, but may fail to restore timer values
+		    configured for individual transactions
+		    with <function>t_set_fr</function>.
+		</para>
+	    </listitem>
+
 	</itemizedlist>
 	</itemizedlist>
     </section>
     </section>
     
     

+ 131 - 77
modules/tm/t_serial.c

@@ -42,9 +42,6 @@
 /* usr_avp flag for sequential forking */
 /* usr_avp flag for sequential forking */
 #define Q_FLAG      (1<<2)
 #define Q_FLAG      (1<<2)
 
 
-/* module parameter variable */
-int fr_inv_timer_next = INV_FR_TIME_OUT_NEXT;
-
 /* Struture where information regarding contacts is stored */
 /* Struture where information regarding contacts is stored */
 struct contact {
 struct contact {
     str uri;
     str uri;
@@ -118,72 +115,100 @@ static inline int decode_branch_info(char *info, str *uri, str *dst, str *path,
     int port, proto;
     int port, proto;
     char *pos, *at, *tmp;
     char *pos, *at, *tmp;
 
 
-    pos = strchr(info, '\n');
-    uri->len = pos - info;
-    if (uri->len) {
-		uri->s = info;
-    } else {
-		uri->s = 0;
+	if (info == NULL) {
+		ERR("decode_branch_info: Invalid input string.\n");
+		return 0;
+	}
+	
+	/* Reset or return arguments to sane defaults */
+	uri->s = 0; uri->len = 0;
+	dst->s = 0; dst->len = 0;
+	path->s = 0; path->len = 0;
+	*sock = NULL;
+	*flags = 0;
+	
+	/* Make sure that we have at least a non-empty URI string, it is fine if
+	 * everything else is missing, but we need at least the URI. */
+	uri->s = info; 
+	if ((pos = strchr(info, '\n')) == NULL) { 
+		uri->len = strlen(info); 
+			/* We don't even have the URI string, this is bad, report an
+			 * error. */
+		if (uri->len == 0) goto uri_missing;
+		return 1;
     }
     }
-    at = pos + 1;
-
-    pos = strchr(at, '\n');
-    dst->len = pos - at;
-    if (dst->len) {
-		dst->s = at;
-    } else {
-		dst->s = 0;
+	uri->len = pos - info;
+	if (uri->len == 0) goto uri_missing;
+
+	/* If we get here we have at least the branch URI, now try to parse as
+	 * much as you can. All output variable have been initialized above, so it
+	 * is OK if any of the fields are missing from now on. */
+    dst->s = at = pos + 1;
+    if ((pos = strchr(at, '\n')) == NULL) {
+		dst->len = strlen(dst->s);
+		return 1;
     }
     }
-    at = pos + 1;
+	dst->len = pos - at;
 
 
-    pos = strchr(at, '\n');
-    path->len = pos - at;
-    if (path->len) {
-		path->s = at;
-    } else {
-		path->s = 0;
+    path->s = at = pos + 1;
+    if ((pos = strchr(at, '\n')) == NULL) {
+		path->len = strlen(path->s);
+		return 1;
     }
     }
-    at = pos + 1;
+    path->len = pos - at;
 
 
-    pos = strchr(at, '\n');
-    s.len = pos - at;
-    if (s.len) {
-		s.s = at;
+    s.s = at = pos + 1;
+    if ((pos = strchr(at, '\n')) == NULL) {
+		/* No LF found, that means we take the string till the final zero
+		 * termination character and pass it directly to parse_phostport
+		 * without making a zero-terminated copy. */
+		tmp = s.s;
+		s.len = strlen(s.s);
+	} else {
+		/* Our string is terminated by LF, so we need to make a
+		 * zero-terminated copy of the string before we pass it to
+		 * parse_phostport. */
+		s.len = pos - at;
 		if ((tmp = as_asciiz(&s)) == NULL) {
 		if ((tmp = as_asciiz(&s)) == NULL) {
 			ERR("No memory left\n");
 			ERR("No memory left\n");
 			return 0;
 			return 0;
 		}
 		}
+	}	
+	if (s.len) {
 		if (parse_phostport(tmp, &host.s, &host.len,
 		if (parse_phostport(tmp, &host.s, &host.len,
 							&port, &proto) != 0) {
 							&port, &proto) != 0) {
-			LM_ERR("parsing of socket info <%.*s> failed\n",  s.len, s.s);
-			pkg_free(tmp);
+			LM_ERR("parsing of socket info <%s> failed\n", tmp);
+			if (pos) pkg_free(tmp);
 			return 0;
 			return 0;
 		}
 		}
-		pkg_free(tmp);
+
 		*sock = grep_sock_info(&host, (unsigned short)port,
 		*sock = grep_sock_info(&host, (unsigned short)port,
 							   (unsigned short)proto);
 							   (unsigned short)proto);
 		if (*sock == 0) {
 		if (*sock == 0) {
-			LM_ERR("invalid socket <%.*s>\n", s.len, s.s);
+			LM_ERR("invalid socket <%s>\n", tmp);
+			if (pos) pkg_free(tmp);
 			return 0;
 			return 0;
 		}
 		}
-    } else {
-		*sock = 0;
-    }
-    at = pos + 1;
+	}
+	
+	if (pos) pkg_free(tmp);
+	else return 1;
+	
+    s.s = at = pos + 1;
+    if ((pos = strchr(at, '\n')) == NULL) s.len = strlen(s.s);
+    else s.len = pos - s.s;
 
 
-    pos = strchr(at, '\n');
-    s.len = pos - at;
     if (s.len) {
     if (s.len) {
-		s.s = at;
 		if (str2int(&s, flags) != 0) {
 		if (str2int(&s, flags) != 0) {
-			LM_ERR("failed to decode flags <%.*s>\n", s.len, s.s);
+			LM_ERR("failed to decode flags <%.*s>\n", STR_FMT(&s));
 			return 0;
 			return 0;
 		}
 		}
-    } else {
-		*flags = 0;
     }
     }
-
     return 1;
     return 1;
+
+uri_missing:
+	ERR("decode_branch_info: Cannot decode branch URI.\n");
+	return 0;
 }
 }
 
 
 
 
@@ -349,7 +374,7 @@ rest:
 				contacts_avp, val);
 				contacts_avp, val);
 		pkg_free(branch_info.s);
 		pkg_free(branch_info.s);
 		LM_DBG("loaded contact <%.*s> with q_flag <%d>\n",
 		LM_DBG("loaded contact <%.*s> with q_flag <%d>\n",
-			   val.s.len, val.s.s, curr->q_flag);
+			   STR_FMT(&val.s), curr->q_flag);
 		curr = curr->next;
 		curr = curr->next;
     }
     }
 
 
@@ -379,6 +404,8 @@ int t_next_contacts(struct sip_msg* msg, char* key, char* value)
     unsigned int flags;
     unsigned int flags;
     struct cell *t;
     struct cell *t;
 	struct search_state st;
 	struct search_state st;
+	ticks_t orig;
+	unsigned int avp_timeout;
 
 
     /* Check if contacts_avp has been defined */
     /* Check if contacts_avp has been defined */
     if (contacts_avp.n == 0) {
     if (contacts_avp.n == 0) {
@@ -405,45 +432,44 @@ int t_next_contacts(struct sip_msg* msg, char* key, char* value)
 			return 1;
 			return 1;
 		}
 		}
 
 
-		LM_DBG("next contact is <%s>\n", val.s.s);
+		LM_DBG("next contact is <%.*s>\n", STR_FMT(&val.s));
 
 
 		if (decode_branch_info(val.s.s, &uri, &dst, &path, &sock, &flags)
 		if (decode_branch_info(val.s.s, &uri, &dst, &path, &sock, &flags)
 			== 0) {
 			== 0) {
-			LM_ERR("decoding of branch info <%.*s> failed\n",
-				   val.s.len, val.s.s);
+			LM_ERR("decoding of branch info <%.*s> failed\n", STR_FMT(&val.s));
 			destroy_avp(avp);
 			destroy_avp(avp);
 			return -1;
 			return -1;
 		}
 		}
 
 
 		/* Rewrite Request-URI */
 		/* Rewrite Request-URI */
 		rewrite_uri(msg, &uri);
 		rewrite_uri(msg, &uri);
-		set_dst_uri(msg, &dst);
-		set_path_vector(msg, &path);
+		if (dst.s && dst.len) set_dst_uri(msg, &dst);
+		else reset_dst_uri(msg);
+		if (path.s && path.len) set_path_vector(msg, &path);
+		else reset_path_vector(msg);
 		msg->force_send_socket = sock;
 		msg->force_send_socket = sock;
 		setbflagsval(0, flags);
 		setbflagsval(0, flags);
 
 
 		if (avp->flags & Q_FLAG) {
 		if (avp->flags & Q_FLAG) {
 			destroy_avp(avp);
 			destroy_avp(avp);
 			/* Set fr_inv_timer */
 			/* Set fr_inv_timer */
-			val.n = fr_inv_timer_next;
-			if (add_avp(fr_inv_timer_avp_type, fr_inv_timer_avp, val) != 0) {
-				LM_ERR("setting of fr_inv_timer_avp failed\n");
+			if (t_set_fr(msg, cfg_get(tm, tm_cfg, fr_inv_timeout_next), 0) 
+				== -1) {
+				ERR("Cannot set fr_inv_timer value.\n");
 				return -1;
 				return -1;
 			}
 			}
 			return 1;
 			return 1;
 		}
 		}
-
+		
 		/* Append branches until out of branches or Q_FLAG is set */
 		/* Append branches until out of branches or Q_FLAG is set */
 		prev = avp;
 		prev = avp;
 		while ((avp = search_next_avp(&st, &val))) {
 		while ((avp = search_next_avp(&st, &val))) {
 			destroy_avp(prev);
 			destroy_avp(prev);
-
-			LM_DBG("next contact is <%s>\n", val.s.s);
+			LM_DBG("next contact is <%.*s>\n", STR_FMT(&val.s));
 
 
 			if (decode_branch_info(val.s.s, &uri, &dst, &path, &sock, &flags)
 			if (decode_branch_info(val.s.s, &uri, &dst, &path, &sock, &flags)
 				== 0) {
 				== 0) {
-				LM_ERR("decoding of branch info <%.*s> failed\n",
-					   val.s.len, val.s.s);
+				LM_ERR("decoding of branch info <%.*s> failed\n", STR_FMT(&val.s));
 				destroy_avp(avp);
 				destroy_avp(avp);
 				return -1;
 				return -1;
 			}
 			}
@@ -456,35 +482,30 @@ int t_next_contacts(struct sip_msg* msg, char* key, char* value)
 
 
 			if (avp->flags & Q_FLAG) {
 			if (avp->flags & Q_FLAG) {
 				destroy_avp(avp);
 				destroy_avp(avp);
-				val.n = fr_inv_timer_next;
-				if (add_avp(fr_inv_timer_avp_type, fr_inv_timer_avp, val)
-					!= 0) {
-					LM_ERR("setting of fr_inv_timer_avp failed\n");
+				/* Set fr_inv_timer */
+				if (t_set_fr(msg, cfg_get(tm, tm_cfg, fr_inv_timeout_next), 0) == -1) {
+					ERR("Cannot set fr_inv_timer value.\n");
 					return -1;
 					return -1;
 				}
 				}
 				return 1;
 				return 1;
 			}
 			}
 			prev = avp;
 			prev = avp;
 		}
 		}
-	
+		destroy_avp(prev);
     } else {
     } else {
-	
-		/* Transaction exists => only load branches */
+			/* Transaction exists => only load branches */
 
 
 		/* Find first contacts_avp value */
 		/* Find first contacts_avp value */
 		avp = search_first_avp(contacts_avp_type, contacts_avp, &val, &st);
 		avp = search_first_avp(contacts_avp_type, contacts_avp, &val, &st);
 		if (!avp) return -1;
 		if (!avp) return -1;
 
 
 		/* Append branches until out of branches or Q_FLAG is set */
 		/* Append branches until out of branches or Q_FLAG is set */
-		prev = avp;
 		do {
 		do {
-
-			LM_DBG("next contact is <%s>\n", val.s.s);
+			LM_DBG("next contact is <%.*s>\n", STR_FMT(&val.s));
 
 
 			if (decode_branch_info(val.s.s, &uri, &dst, &path, &sock, &flags)
 			if (decode_branch_info(val.s.s, &uri, &dst, &path, &sock, &flags)
 				== 0) {
 				== 0) {
-				LM_ERR("decoding of branch info <%.*s> failed\n",
-					   val.s.len, val.s.s);
+				LM_ERR("decoding of branch info <%.*s> failed\n", STR_FMT(&val.s));
 				destroy_avp(avp);
 				destroy_avp(avp);
 				return -1;
 				return -1;
 			}
 			}
@@ -503,16 +524,49 @@ int t_next_contacts(struct sip_msg* msg, char* key, char* value)
 			prev = avp;
 			prev = avp;
 			avp = search_next_avp(&st, &val);
 			avp = search_next_avp(&st, &val);
 			destroy_avp(prev);
 			destroy_avp(prev);
-
 		} while (avp);
 		} while (avp);
 
 
-		/* Restore fr_inv_timer */
-		val.n = default_tm_cfg.fr_inv_timeout;
-		if (add_avp(fr_inv_timer_avp_type, fr_inv_timer_avp, val) != 0) {
-			LM_ERR("setting of fr_inv_timer_avp failed\n");
-			return -1;
+		/* If we got there then we have no more branches for subsequent serial
+		 * forking and the current set is the last one. For the last set we do
+		 * not use the shorter timer fr_inv_timer_next anymore, instead we use
+		 * the usual fr_inv_timer.
+		 *
+		 * There are three places in sip-router which can contain the actual
+		 * value of the fr_inv_timer. The first place is the variable
+		 * use_fr_inv_timeout defined in timer.c That variable is set when the
+		 * script writer calls t_set_fr in the script. Its value can only be
+		 * used from within the process in which t_set_fr was called. It is
+		 * not guaranteed that when we get here we are still in the same
+		 * process and therefore we might not be able to restore the correct
+		 * value if the script writer used t_set_fr before calling
+		 * t_next_contacts. If that happens then the code below detects this
+		 * and looks into the AVP or cfg framework for other value. In other
+		 * words, t_next_contact does not guarantee that fr_inv_timer values
+		 * configured on per-transaction basis with t_set_fr will be correctly
+		 * restored.
+		 *
+		 * The second place is the fr_inv_timer_avp configured in modules
+		 * parameters. If that AVP exists and then its value will be correctly
+		 * restored by t_next_contacts. The AVP is an alternative way of
+		 * configuring fr_inv_timer on per-transaction basis, it can be used
+		 * interchangeably with t_set_fr. Function t_next_contacts always
+		 * correctly restores the timer value configured in the AVP.
+		 *
+		 * Finally, if we can get the value neither from user_fr_inv_timeout
+		 * nor from the AVP, we turn to the fr_inv_timeout variable in the cfg
+		 * framework. This variable contains module's default and it always
+		 * exists and is available. */
+		orig = (ticks_t)get_msgid_val(user_fr_inv_timeout, msg->id, int);
+		if (orig == 0) {
+			if (!fr_inv_avp2timer(&avp_timeout)) {
+				/* The value in the AVP is in seconds and needs to be
+				 * converted to ticks */
+				orig = S_TO_TICKS((ticks_t)avp_timeout);
+			} else {
+				orig = cfg_get(tm, tm_cfg, fr_inv_timeout);
+			}
 		}
 		}
-	
+		change_fr(t, orig, 0);
     }
     }
 
 
     return 1;
     return 1;

+ 1 - 1
modules/tm/tm.c

@@ -481,6 +481,7 @@ static param_export_t params[]={
 	{"via1_matching",       PARAM_INT, &default_tm_cfg.via1_matching         },
 	{"via1_matching",       PARAM_INT, &default_tm_cfg.via1_matching         },
 	{"fr_timer",            PARAM_INT, &default_tm_cfg.fr_timeout            },
 	{"fr_timer",            PARAM_INT, &default_tm_cfg.fr_timeout            },
 	{"fr_inv_timer",        PARAM_INT, &default_tm_cfg.fr_inv_timeout        },
 	{"fr_inv_timer",        PARAM_INT, &default_tm_cfg.fr_inv_timeout        },
+	{"fr_inv_timer_next",   PARAM_INT, &default_tm_cfg.fr_inv_timeout_next   },
 	{"wt_timer",            PARAM_INT, &default_tm_cfg.wait_timeout          },
 	{"wt_timer",            PARAM_INT, &default_tm_cfg.wait_timeout          },
 	{"delete_timer",        PARAM_INT, &default_tm_cfg.delete_timeout        },
 	{"delete_timer",        PARAM_INT, &default_tm_cfg.delete_timeout        },
 	{"retr_timer1",         PARAM_INT, &default_tm_cfg.rt_t1_timeout         },
 	{"retr_timer1",         PARAM_INT, &default_tm_cfg.rt_t1_timeout         },
@@ -514,7 +515,6 @@ static param_export_t params[]={
 	{"cancel_b_method",     PARAM_INT, &default_tm_cfg.cancel_b_flags},
 	{"cancel_b_method",     PARAM_INT, &default_tm_cfg.cancel_b_flags},
 	{"reparse_on_dns_failover", PARAM_INT, &default_tm_cfg.reparse_on_dns_failover},
 	{"reparse_on_dns_failover", PARAM_INT, &default_tm_cfg.reparse_on_dns_failover},
 	{"on_sl_reply",         PARAM_STRING|PARAM_USE_FUNC, fixup_on_sl_reply   },
 	{"on_sl_reply",         PARAM_STRING|PARAM_USE_FUNC, fixup_on_sl_reply   },
-	{"fr_inv_timer_next",   PARAM_INT, &default_tm_cfg.fr_inv_timeout_next   },
 	{"contacts_avp",        PARAM_STRING, &contacts_avp_param                },
 	{"contacts_avp",        PARAM_STRING, &contacts_avp_param                },
 	{"disable_6xx_block",   PARAM_INT, &default_tm_cfg.disable_6xx           },
 	{"disable_6xx_block",   PARAM_INT, &default_tm_cfg.disable_6xx           },
 	{0,0,0}
 	{0,0,0}

Some files were not shown because too many files changed in this diff