# -*- perl -*-
=head1 NAME

Aranea Orthogonal Communications Protocol

$Revision$

=head1 SYNOPSIS

 <Origin>,<Group>,<TimeSeq>,<Hop>[,<From>]|<Tag>,<Data>...

=head1 ABSTRACT

For many years DX Clusters have used a protocol which was designed
for a non-looped tree of L</Node>s. This environment has probably never
been reliably achieved in practice; certainly not recently.

There have always been loops, sometimes bringing the network to its
knees. In modern usage, both in order to get some resilience and also
to expedite information flow, we use internet based, deliberately
looped networks with filtering. Whilst this works, after a fashion, there
are all sorts of problems that the current PC protocol can never
address.

This document
describes a complete replacement for the PC protocol. It allows a
fully looped network, is inherently extensible and should be simple
to implement (especially in perl).

All implementations shall use B<only> this protocol
for inter-node communications.

=head1 DESCRIPTION

This protocol is
designed to be an extensible basis for any type of one -> many
"instant" line-based communications tasks.

The protocol is designed to be flood routed in a meshed network in
as efficient a manner as possible. The reason we have chosen this
mechanism is that most L</Messages> need to be broadcast to all L</Node>s.

Experience has shown that L</Node>s will appear and (less frequently)
disappear without much (or any) notice.
The constantly changing and uncoordinated
nature of the network therefore doesn't lend itself to fixed routing policies.
Whilst metrics and routing tables (more like routing hint tables) will be
built up over time, an aggressive aging algorithm will also be employed to prevent
a lot of stale routing information being retained.

Having said that: where routes have
been learned through past traffic, and this data is recent, then direct routing should be used.
Those L</Messages> that could be routed (likely to be mainly single line one to
one "talk" L</Messages>) happen sufficiently infrequently, anyway,
that, should they need to be flood routed
(because no route has been learned yet), it is a small cost overall.

=head1 Messages

A message is a single line of UTF8 encoded and HTTP escaped text
terminated in the standard internet manner with a <CR><LF>.

Each message consists of a L</Routing Section> and a L</Command Section>.
The two sections are separated with the '|' character.
It follows that this
character (as well as non-printable characters, <CR>, <LF> and
a small number of other reserved characters)
can only be sent escaped. This is described further in the
L</Command Section> and L</Fields>.

Most of this document is concerned with the L</Routing Section>; however,
some L</Standard Commands>, which all implementations should issue and
must accept, are described.
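
The two-section layout can be taken apart with a couple of C<split> calls. The following is only a minimal sketch: the function name C<split_message> and the hash keys it returns are invented for illustration, and no character-set validation or unescaping is attempted here.

```perl
use strict;
use warnings;

# Split one received line into its routing fields and command section.
# Returns nothing (the caller drops the message) if the basic shape is wrong.
sub split_message {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                    # accept <CR><LF> or a bare <LF>

    # A literal '|' in data must arrive escaped, so the first '|' seen
    # is always the boundary between the two sections.
    my ($routing, $command) = split /\|/, $line, 2;
    return unless defined $command;

    my @r = split /,/, $routing, -1;
    return unless @r == 4 || @r == 5;        # <From> is the only optional field

    my ($tag, @fields) = split /,/, $command, -1;
    return {
        origin => $r[0], group => $r[1], timeseq => $r[2], hop => $r[3],
        from   => $r[4], tag   => $tag,  fields  => \@fields,
    };
}

my $msg = split_message("GB7TLH,G8TIC,3D03450019,3,G1TLH|T,Hiya Mike whats happening?\r\n");
print "$msg->{origin} -> $msg->{group}: $msg->{fields}[0]\n";
```

The fields are still escaped at this point; decoding is a separate step applied to each data field.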

=head1 Applications

In the past, messaging applications such as DX Cluster software have maintained
a fairly strict division between L</Node>s and L</User>s. This protocol attempts
to get away from that by deliberately blurring (or, in some cases, removing)
any distinction between the two.

Applications that use this protocol are essentially all peers and therefore the
only real difference between L</Node>s and L</User>s is that a node has one or more
listeners running that will,
potentially, allow incoming connections from other L</Node>s, L</Endpoint>s or L</User>s. These
routable entities are called L</Terminal>s.

Any application that is a sink and/or source of data, is capable of obeying
the protocol message construction rules and understands how to deduplicate incoming messages
correctly can operate as a routable entity or L</Terminal> in this protocol. It is called an L</Endpoint>.

An L</Endpoint> is called a L</Node> if it accepts connections from L</Endpoint>s and is
prepared to route messages on their behalf to other L</Node>s or L</Endpoint>s. In addition it
may provide some other, usually simpler, interface (eg simple telnet access) for direct user access,
acting in the protocol on those users' behalf.

The concept of an L</Endpoint> has been invented because modern clients are
capable of being more intelligent than simple
character based connections such as telnet or ax25. They also wish to be able to
distinguish between the various classes of message, such as: DX spots,
announces, talk, logging info etc. It is a pain to have to do it, as now,
by trying to make sense of the (slightly different for each piece of node
software) human readable "user" version of the output. Far better to pass on
regular, specified, easily computer decodable versions of the message
(i.e. in this protocol) and leave
the human presentation to the application.

It also helps to modularise the various interfaces that may be implemented, such
as the legacy, character based connections of existing PC protocol based nodes.
They should be treated
as local clients, in fact as L</User>s, B<not> as peers in this protocol. It is likely that, in order
to do this, some extra L</Tag>s will need to be defined at application level.

=head1 Definitions

In this document we use a number of terms that need to be defined.

=head2 Terminal

A L</Terminal> is a routable entity, in other words: a callsign or service that can be routed
to, that lives at one or a few L</Node>s.

=head2 User

A L</User> is a connection to a L</Node> (that allows such connections)
that does not occur in protocol. All L</User>s shall be identified with a name
of up to 12 characters in the set [-0-9A-Z_]. All messages have to be routed via the
L</Node> to which this L</User> is connected.

=head2 Endpoint

An L</Endpoint> is a connection to a L</Node> that uses the protocol. From a routing point of
view, it is indistinguishable from a L</Node>. The L</Endpoint> is responsible for creating and decoding
well formed protocol messages. An L</Endpoint> does not route beyond the immediate L</Node>(s) to
which it is connected. It may also be a L</Service> connected to a L</Node> which provides some
addressable service (such as a database) that can be queried.

=head2 Node

A L</Node> is connected to other L</Node>s. It is responsible for routing messages in protocol
from other L</Node>s or L</Endpoint>s, whether directly connected or not. Optionally, a L</Node>
may provide other interfaces, such as direct L</User> connections or legacy PC protocol speaking
DX Clusters.

=head2 Channel

A L</Channel> is a L</Group> address that is not a L</Terminal>. It is (unless qualified by a L</Terminal>)
broadcast on all of a L</Node>'s interfaces unless prevented by some filtering or other local policy on
that L</Node>.

=head2 Service

A L</Service> is an application that either plugs into or connects as an L</Endpoint> to a L</Node>. It is an
application that, in effect, is a database. In other words: queries are sent to the L</Service> and it sends
back a reply.

=head1 Routing Section

The application that implements this protocol is essentially a line
oriented message router. One line equals one message. Each line is
effectively a datagram.

It is assumed that L</Node>s are connected to
each other using a "reliable" streaming protocol such as TCP/IP or
AX25. Having said that: in context, L</Messages> in this protocol could be
multi/broadcast, either "as is" or wrapped in some other framing
protocol.

Although the physical transport between L</Node>s is reliable, the actual message
is unreliable, because this is an unreliable, best effort, "please route my packets
through your node" protocol. There is no guarantee that a message
will get to the other side of a mesh of L</Node>s. There may be a
discontinuity either caused by outage or deliberate filtering.

However, as it is envisaged that most L</Messages> will be flood routed or,
in the case of directed L</Messages> (those that have a L</Group> containing a L</Terminal> of some kind),
sent down some/most/all interfaces showing a route for that
direction, it is unlikely that L</Messages> will be lost in practice.

Assuming that there is a path between all the L</Node>s in a network, then it is guaranteed
that a message will be delivered everywhere, eventually. It is possible (indeed likely) that
copies of a message
will arrive at L</Node>s more than once. L</Node>s are responsible for deduplicating those messages
using the information in the L</Routing Section>.

=head2 Field Description

All fields in the L</Routing Section> are compulsory except the L</From> field. If it is missing,
so is its separating comma.

The L</Hop> field is incremented on receipt of a message on a node.

Fields are separated by the comma ',' character, with the last field
present followed by the vertical bar '|' character.

The characters allowed in the routing section are restricted. Any
invalid characters in any field will cause the whole message to be
silently dropped.
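
One way to apply the "drop on any invalid character" rule is a single anchored pattern over the whole routing section, using the character sets given in the field descriptions of this section. This is a sketch under assumptions rather than a normative grammar: the names C<$NAME>, C<$ROUTING> and C<parse_routing> are invented, hex digits are assumed to be uppercase (as in the examples), L</Hop> is assumed to be written in decimal, and L</From> is assumed to share the L</Origin> character set.

```perl
use strict;
use warnings;

# Up to 12 characters from the set [-A-Z0-9_/]
my $NAME    = qr/[-A-Z0-9_\/]{1,12}/;
my $ROUTING = qr/\A($NAME),($NAME(?::$NAME)?),([0-9A-F]{10}),([0-9]+)(?:,($NAME))?\z/;

# Returns a hash of the routing fields, or nothing if the section is
# malformed, in which case the caller silently drops the message.
sub parse_routing {
    my ($routing) = @_;
    my ($origin, $group, $timeseq, $hop, $from) = $routing =~ $ROUTING
        or return;
    my ($target, $terminal) = split /:/, $group, 2;   # e.g. GB7DJK:G1TLH
    return {
        origin   => $origin,
        group    => $target,
        terminal => $terminal,     # undef unless the Group was qualified
        timeseq  => $timeseq,
        hop      => $hop + 0,
        from     => $from,         # undef when the optional field is absent
    };
}

my $r = parse_routing('GB7TLH,GB7BAA:G7BRN,1512346543,0,G1TLH');
print "$r->{origin} to $r->{terminal} at $r->{group}\n";
```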

More detailed descriptions of the fields follow:

=over

=item B<Origin>

This is a compulsory field. It is the name of the originating node.
The field can contain up to 12 characters in the set [-A-Z0-9_/] in
any order. Higher layers may restrict this further.

The field must not be changed by any other node.

=item B<Group>

This is the Group (or Channel) to be used for this data. It is compulsory.

It is a string of up to 12 characters
in the set [-A-Z0-9_/] in any order.

Optionally, for extra routing to
a particular L</Terminal> connected at a specific L</Node>, or even a
particular L</Terminal> in a L</Group>,
it may have another 12 character
string in the same set, concatenated with the first string. The two strings are separated by a ':'
character. For example:

  DX                        # the DX group
  GB7DJK                    # the node GB7DJK
  G1TLH                     # the user or endpoint G1TLH
  GB7DJK:G1TLH              # the user G1TLH at GB7DJK
  DX:G1TLH                  # the user G1TLH in the DX group

This field can contain either a L</Terminal> or some other string which is interpreted
as a broadcastable group address. Any message that has a L</Group> that is not recognised as a L</Terminal> must
be broadcast.

This means that messages to callsigns, for whom no specific routing information is available,
will be found by means of a broadcast. Hopefully this will cause some kind of activity on behalf of
that callsign, which will allow routing tables to be gathered that narrow down the scope of any future
message to that callsign through the network.

Remember that not all L</Node>s may pass every L</Group> field, depending on local policy.

=item B<TimeSeq>

This is a compulsory field. It is a 10 hexadecimal digit string which
consists of a day number (1-31),
a flag to indicate NTP synchronisation in use and
seconds within that day (0-86399) [total of 6 hex digits]
that are concatenated with a sequence number (0-65535)
[4 hex digits] making the total of 10 hexadecimal digits.

The date portion is constructed as:

  my $date = ((((gmtime)[3] << 1) | $ntpflag) << 18) | (time % 86400);

The sequence number is simply an unsigned short (or 16 bit) number
starting at 0.

Each message originated at this node will increment the sequence
number.

=item B<Hop>

This is a compulsory field. It is the number of hops from the
originating node. It is incremented immediately on receipt,
before its value is examined.

So the originating node sends a message with a L</Hop> of 0 and the
neighbouring nodes must increment this field before passing
it on to higher layers for onward processing.

Implementations may have an upper limit to this field and may
silently drop incoming L</Messages> with a L</Hop> count greater than the
limit.

=item B<From>

The L</From> field is optional. When present, it represents a L</Terminal> at
the originating L</Node>. If it is missing then either it is not relevant or it
is assumed to be the L</Origin>.

=back
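
To make the L</TimeSeq> layout concrete, here is a small sketch that builds the 10 digit string from the date expression given above and takes one apart again. The function names and the C<$seq> counter are invented for this example, and wrapping the sequence number back to 0 after 65535 is an assumption, since the text only says it is a 16 bit number.

```perl
use strict;
use warnings;

my $seq = 0;    # per-node sequence counter, starting at 0

# Build the 10 hex digit TimeSeq for a message originated at $time.
sub make_timeseq {
    my ($time, $ntpflag) = @_;
    my $day  = (gmtime $time)[3];                       # day of month, 1-31
    my $date = ((($day << 1) | ($ntpflag ? 1 : 0)) << 18) | ($time % 86400);
    my $ts   = sprintf "%06X%04X", $date, $seq;
    $seq = ($seq + 1) & 0xFFFF;     # wrap at 16 bits (an assumption)
    return $ts;
}

# Take a TimeSeq apart: (day, ntp flag, seconds within day, sequence).
sub split_timeseq {
    my ($ts) = @_;
    my ($d, $s) = $ts =~ /\A([0-9A-F]{6})([0-9A-F]{4})\z/ or return;
    my $date = hex $d;
    return ($date >> 19, ($date >> 18) & 1, $date & 0x3FFFF, hex $s);
}

print make_timeseq(time, 1), "\n";
print join(',', split_timeseq('3D02350001')), "\n";    # prints 7,1,66101,1
```

The day occupies the top 5 bits, the NTP flag the next bit and the seconds the low 18 bits, which is why the date part always fits in 6 hex digits.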

=head2 Routing

It is assumed that nodes will be connected in a looped network with
more than one route available (in many cases) to another node.

In any case, most traffic is not directed, but broadcast to all users
on all nodes.

Each message is uniquely identified by the (L</Origin>,L</TimeSeq>)
tuple. The basic system will learn which interfaces can see what nodes
by looking at the tuple and merging that with the L</Hop> count.
Each interface remembers the latest L</TimeSeq> with the lowest L</Hop>
for each L</Origin> that arrives on that interface. It also remembers
the number of L</Messages> for that L</Origin> that have been received on
that interface.

Any message for onward broadcast is duplicated and sent out on all
interfaces that it did not come in on.

Any message that is directed to a particular node will be sent out on
the "best" interface based on routing information gathered so far. If there
is more than one possible route then, depending on network or local
policy, the message may be duplicated and sent on other interfaces
as well.
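
The per-interface bookkeeping described above might be sketched as follows. Everything here is illustrative: the C<%seen> table, the function names and the one hour C<$MAX_AGE> are invented, "latest" is approximated by arrival order rather than by comparing L</TimeSeq> values, and it is assumed that arrivals are noted before deduplication discards the copy.

```perl
use strict;
use warnings;

my %seen;             # $seen{$interface}{$origin} = { timeseq, hop, count, heard }
my $MAX_AGE = 3600;   # seconds before a hint is ignored; arbitrary for this sketch

# Record one arrival. $hop is the value after the increment on receipt.
sub note_arrival {
    my ($if, $origin, $timeseq, $hop, $now) = @_;
    my $r = $seen{$if}{$origin} ||= { timeseq => '', hop => $hop, count => 0 };
    if ($r->{timeseq} ne $timeseq) {         # a different (newer) message
        @$r{qw(timeseq hop)} = ($timeseq, $hop);
    } elsif ($hop < $r->{hop}) {             # same message by a shorter path
        $r->{hop} = $hop;
    }
    $r->{count}++;
    $r->{heard} = $now;
}

# Choose the interface with the lowest hop count among fresh hints.
# Returns undef when nothing recent is known, so the caller floods.
sub best_interface {
    my ($origin, $now) = @_;
    my ($best, $best_hop);
    for my $if (sort keys %seen) {
        my $r = $seen{$if}{$origin} or next;
        next if $now - $r->{heard} > $MAX_AGE;          # aged out
        ($best, $best_hop) = ($if, $r->{hop})
            if !defined $best_hop || $r->{hop} < $best_hop;
    }
    return $best;
}

note_arrival('if0', 'GB7BAA', '3D02350001', 3, time);
note_arrival('if1', 'GB7BAA', '3D02350001', 1, time);
print "best route to GB7BAA: ", best_interface('GB7BAA', time), "\n";
```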

=head2 DeDuplication

On receipt of a message, its unique tuple (L</Origin>,L</TimeSeq>) is
checked against a hash table. If it exists: the message is silently
dropped. If it does not exist in the hash table then the tuple is
added.

The hash table is periodically cleaned, removing tuples that
have expired. The length of time a tuple remains in the hash table
is implementation dependent but could easily be several days, if
required.

This mechanism only ensures that a message broadcast around the network
travels the least distance and through the fewest nodes possible. It
is up to higher layers to make sure that data carried is not, itself,
duplicated!
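
A minimal sketch of such a hash table, assuming a plain Perl hash keyed on the tuple and a two day retention period (both the names and the period are illustrative choices, not part of the protocol):

```perl
use strict;
use warnings;

my %dedup;                  # "$origin,$timeseq" => time first seen
my $KEEP = 2 * 86400;       # retention in seconds; two days chosen arbitrarily

# True if the message is new (and now recorded), false if it is a
# duplicate that should be silently dropped.
sub is_new {
    my ($origin, $timeseq, $now) = @_;
    my $key = "$origin,$timeseq";   # ',' cannot occur inside either field
    return 0 if exists $dedup{$key};
    $dedup{$key} = $now;
    return 1;
}

# Periodic clean: forget tuples that have been held longer than $KEEP.
sub clean_dedup {
    my ($now) = @_;
    delete @dedup{ grep { $now - $dedup{$_} > $KEEP } keys %dedup };
}

my $now = time;
print is_new('GB7TLH', '3D02350001', $now) ? "forward\n" : "drop\n";
print is_new('GB7TLH', '3D02350001', $now) ? "forward\n" : "drop\n";
```

The second call reports a duplicate because the tuple was recorded by the first.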

=head2 Examples

 # on link startup from GB7BAA (both sides hello)
 GB7TLH,ROUTE,3D02350001,0|HELLO,Aranea,1.2,24.123
 GB7BAA,ROUTE,3D02355421,1|HELLO,Aranea,1.1,23.245

 # on user startup to GB7TLH
 GB7TLH,ROUTE,3D042506F2,0,G1TLH|HELLO,PClient,1.3

 # on user disconnection
 GB7TLH,ROUTE,3D9534F32D,0,G1TLH|BYE

 # a talk (actually 'text') message to a user (some distance away
 # from the origin node)
 GB7TLH,G8TIC,3D03450019,3,G1TLH|T,Hiya Mike whats happening?

 # a talk/chat/text message to a Group
 GB7TLH,VHF,0413525F23,2,G1TLH|T,2m is opening on MS

 # a ping to find the whereabouts and distance of a user from a node
 # the hex number on the end is the ping ID
 GB7TLH,G7BRN,1512346543,0,G1TLH|PING,9F4D

 # this effectively asks whether the user is on-line on a particular node
 GB7TLH,GB7BAA:G7BRN,1512346543,0,G1TLH|PING,35DE

 # A possible reply, same ID as ping followed by the no of hops on the
 # ping that was received thus telling you how far away it is.
 GB7BAA,G1TLH,1512450534,3,G7BRN|PONG,35DE,3

=head1 Command Section

The L</Command Section> of the message contains the actual data being
passed. It is called the Command Section because all commands
are identified with a L</Tag>, each of which is implemented by
the software using this protocol. Each L</Tag> (usually) is followed by one
or more L</Fields>.

=head2 Tag

The L</Tag> consists of a string of uppercase letters and digits, starting
with an uppercase letter. Tags should be as short as is meaningful.

Valid tags would be:

 DX
 PC23
 ANN

Invalid tags include:

 1AAA
 dx
 Ann

The L</Tag> is separated from its data L</Fields> by a comma ','.
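
The tag rule reduces to one anchored pattern. A tiny sketch (the name C<valid_tag> is invented here), run against the valid and invalid examples above:

```perl
use strict;
use warnings;

# An uppercase letter followed by any number of uppercase letters or digits.
sub valid_tag {
    my ($tag) = @_;
    return $tag =~ /\A[A-Z][A-Z0-9]*\z/ ? 1 : 0;
}

for my $tag (qw(DX PC23 ANN 1AAA dx Ann)) {
    printf "%-5s %s\n", $tag, valid_tag($tag) ? "valid" : "invalid";
}
```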

=head2 Fields

All fields
in any subsequent data shall be separated by a comma ','.
All fields shall
be HTTP encoded such that reserved characters (comma ',',
vertical bar '|',
percent '%' and
equals '=')
and non-printable characters less than 127 (or %7F in hex),
including newline and carriage return, are translated to
their two hex digit equivalent preceded by the percent '%' character.

For example:

 "%0D%0A" is "<carriage return><linefeed>".
 "hello%2C there" is "hello, there"

This is not standard CSV: fields are not quoted (delimited with either
' or ").
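
The escaping rule can be sketched as a pair of substitutions. The function names are invented for this example, and treating DEL (0x7F) as non-printable alongside the control characters 0x00-0x1F is an assumption about what "non-printable" covers.

```perl
use strict;
use warnings;

# Escape the reserved characters , | % = and the control characters.
sub encode_field {
    my ($s) = @_;
    $s =~ s/([\x00-\x1F\x7F,|%=])/sprintf "%%%02X", ord $1/ge;
    return $s;
}

# Turn every %XX back into the character it stands for.
sub decode_field {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $s;
}

print encode_field("hello, there"), "\n";     # prints hello%2C there
print encode_field("line1\r\nline2"), "\n";   # prints line1%0D%0Aline2
```

Because '%' is itself escaped, decoding an encoded field always gives back the original text.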

All national characters above 127 are UTF8 encoded in the
standard perl 5.8.x way. It follows that all (perl) programs that
are written according to this specification must say:

 use utf8;

A message (or line) is terminated with <carriage return><linefeed>
0x0d 0x0a. Incoming L</Messages> must be accepted even when terminated
with just <linefeed>.

Care must be taken to make sure that fields have any reserved characters
encoded. In particular: it is perfectly permissible to have <linefeed>
characters in a field - so long as they are escaped.

Fields come in two styles: either simple fields (just containing
data) or B<key>=B<value> pairs. Each pair must be separated from
the next by a comma ','. The B<key> must consist of the set of
characters [a-z0-9_] (ie lowercase letters, digits and underscore),
with a leading letter. The B<value> must be HTTP encoded as
specified above and can otherwise contain any character.
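
Since a literal '=' in data must be escaped, an unescaped '=' after a valid key is enough to tell the two field styles apart. A sketch of that separation follows; C<parse_fields> and C<unescape> are invented names, and the keys C<qrg> and C<mode> in the usage line are made up purely for illustration.

```perl
use strict;
use warnings;

sub unescape {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $s;
}

# Sort the data fields of a command into simple (positional) fields and
# key=value pairs. A field counts as a pair only when the text before the
# first '=' is a valid key.
sub parse_fields {
    my (@raw) = @_;
    my (@simple, %pair);
    for my $f (@raw) {
        if (my ($k, $v) = $f =~ /\A([a-z][a-z0-9_]*)=(.*)\z/s) {
            $pair{$k} = unescape($v);
        } else {
            push @simple, unescape($f);
        }
    }
    return (\@simple, \%pair);
}

my ($simple, $pair) = parse_fields(split /,/, 'hello%2C there,qrg=14025.0,mode=CW', -1);
print "simple: @$simple\n";
print "$_ = $pair->{$_}\n" for sort keys %$pair;
```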

There is no maximum size specified for a message. It is up to each
implementation to enforce one (if only for their own protection).

=head2 Standard Commands

There are a number of L</Standard Commands> which must be accepted by
all implementations.

=over

=item B<HELLO>

 HELLO,<software name>,<version>,<build>,<comments>

Command sent on connection to another node. Both sides send their information
to the other. All the possible arguments are optional, although some of the
arguments should be sent in order to help diagnose problems. This command is
broadcast.

=item B<BYE>

 BYE,<comments>

Command sent to all connections when the software is shutting down. This is sent
by the node just before shutdown occurs. This is really only used to help the
network prune its routing tables. It isn't a requirement. The <comments> field
is optional.

=item B<DISC>

 DISC,<node name>,<comments>

Command sent when a node has disconnected from this node. This message is sent when
an interface shuts down. It need not be sent if a L</BYE> from an interface for
that node has just been received. This command should be broadcast.

The <node name> is mandatory and is the name of the interface that has just
disconnected.

=item B<PING>

 PING,<user>,<ping id>

Command to send a ping to a node or user. This command is used both by the software
and users to determine a) whether a node or user exists and b) how good the path is
between them.

The <ping id> is a unique string which is usually the hexadecimal equivalent of an
integer that is incremented every time it is used. But it can be anything that
will identify this ping using the tuple (L</Origin>,<ping id>) as unique.

=item B<PONG>

 PONG,<ping id>,<user>,<no of hops on ping>

Command to reply to a ping. This is sent as a reply to an incoming ping command.
The <ping id> is the one supplied and the <no of hops on ping> is the number of
hops it took for the ping to arrive.

=item B<T>

 T,<text>

All implementations must be able to send "text" (encoded as specified in
L</Fields>). There would be little point in doing all this otherwise!

=back

=head1 AUTHOR

Dirk Koopman, G1TLH, E<lt>djk@tobit.co.ukE<gt>

=head1 COPYRIGHT AND LICENSE

Copyright 2004-2005 by Dirk Koopman, G1TLH

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.

$Revision$

=cut