techdoc/protocol.pod

   1 # -*- perl -*-
   2 =head1 NAME
   3
   4 Aranea Orthogonal Communications Protocol
   5
   6 $Revision$
   7
   8 =head1 SYNOPSIS
   9
  10  <Origin>,<Group>,<TimeSeq>,<Hop>[,<From>]|<Tag>,<Data>...
  11
  12 =head1 ABSTRACT
  13
  14 For many years DX Clusters have used a protocol which was designed
  15 for a non-looped tree of L</Node>s. This environment has probably never, reliably,
  16 been achieved in practice; certainly not recently.
  17
  18 There have always been loops, sometimes bringing the network to its
  19 knees. In modern usage, both in order to get some resilience and also
  20 to expedite information flow, we use internet based, deliberately
  21 looped networks with filtering. Whilst this works, after a fashion, there
  22 are all sorts of problems that the current PC protocol can never
  23 address.
  24
  25 This document
  26 describes a complete replacement for the PC protocol. It allows a
  27 fully looped network, is inherently extensible and should be simple
  28 to implement (especially in perl).
  29
  30 All implementations of this protocol shall B<only> use this protocol
  31 for inter-node communications.
  32
  33 =head1 DESCRIPTION
  34
  35 This protocol is
  36 designed to be an extensible basis for any type of one -> many
  37 "instant" line-based communications tasks.
  38
  39 This protocol is designed to be flood routed in a meshed network in
  40 as efficient a manner as possible. The reason we have chosen this
  41 mechanism is that most L</Messages> need to be broadcast to all L</Node>s.
  42
  43 Experience has shown that L</Node>s will appear and (less frequently)
  44 disappear without much (or any) notice.
  45 Therefore, the constantly changing and uncoordinated
  46 nature of the network doesn't lend itself to fixed routing policies.
  47
  48 Having said that: directed routing is available where routes have
  49 been learned through past traffic.
  50 Those L</Messages> that could be routed (mainly single line one to
  51 one "talk" L</Messages>)
  52 happen sufficiently infrequently that, should they need to be flood routed
  53 (because no route has been learned yet) it is a small cost overall.
  54
  55 =head1 Messages
  56
  57 A message is a single line of UTF8 encoded and HTTP escaped text
  58 terminated in the standard internet manner with a <CR><LF>.
  59
  60 Each message consists of a L</Routing Section> and a L</Command Section>.
  61 The two sections are separated with the '|' character.
  62 It follows that these
  63 characters (as well as non-printable characters, <CR>, <LF> and
  64 a small number of other reserved characters)
  65 can only be sent escaped. This is described further in the
  66 L</Command Section> and L</Fields>.
  67
  68 Most of this document is concerned with the L</Routing Section>, however
  69 some L</Standard Commands> which all implementations should issue and
  70 must accept are described.
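
The split described above can be sketched in a few lines of perl. This is only an illustration: the name C<split_message> is invented here, and it relies on the rule that a literal '|' inside the data is always escaped, so the first '|' on the line must be the separator.

```perl
use strict;
use warnings;

# Split one received line into its routing and command sections.
# Returns an empty list if the line has no '|' separator.
sub split_message {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # accept <CR><LF> or a bare <LF>
    my ($routing, $command) = split /\|/, $line, 2;
    return unless defined $command;
    return ($routing, $command);
}

my ($routing, $command) =
    split_message("GB7TLH,VHF,0413525F23,2,G1TLH|T,2m is opening on MS\r\n");
print "routing: $routing\n";
print "command: $command\n";
```

The fields inside each section are still escaped at this point; they are decoded later, field by field.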
  71
  72 =head1 Applications
  73
  74 In the past messaging applications such as DX Cluster software have maintained
  75 a fairly strict division between L</Node>s and L</User>s. This protocol attempts
  76 to get away from that by deliberately blurring (or, in some cases, removing)
  77 any distinction between the two.
  78
  79 Applications that use this protocol are essentially all peers and therefore
  80 nodes. The only real difference between L</Node>s and L</User>s is that a "node" has one or more
  81 listeners running that will,
  82 potentially, allow incoming connections from other L</Node>s, L</Endpoint>s or L</User>s. These
  83 routable entities are called L</Terminal>s.
  84
  85 Any application that is a sink and/or source of data for L</Group>s, is capable of obeying
  86 the protocol message construction rules, and understands how to deduplicate incoming messages
  87 correctly can operate as a routable entity or L</Terminal> in this protocol. It is called an L</Endpoint>.
  88
  89 An L</Endpoint> is called a L</Node> if it accepts connections from L</Endpoint>s and is
  90 prepared to route messages on their behalf to other L</Node>s or L</Endpoint>s. In addition it
  91 may provide some other, usually simpler, interface (eg simple telnet access) for direct user access.
  92
  93 The concept of an L</Endpoint> has been invented because modern clients are
  94 capable of being more intelligent than simple
  95 character based connections such as telnet or ax25. They wish to be able to
  96 distinguish between the various classes of message, such as: DX spots,
  97 announces, talk, logging info etc. It is a pain to have to do it, as now,
  98 by trying to make sense of the (slightly different for each piece of node
  99 software) human readable "user" version of the output. Far better to pass on
 100 regular, specified, easily computer decodable versions of the message,
 101 i.e. in this protocol, and leave
 102 the human presentation to the application implementing the L</Endpoint>.
 103
 104 It also helps to modularise the various interfaces that may be implemented such
 105 as the legacy, character based connections of existing PC protocol based nodes.
 106 They should be treated
 107 as local clients, in fact as L</User>s, B<not> as peers in this protocol. It is likely that, in order
 108 to do this, some extra L</Tag>s will need to be defined at application level.
 109
 110 =head1 Definitions
 111
 112 In this document we use a number of terms that need to be defined.
 113
 114 =head2 Terminal
 115
 116 A L</Terminal> is a routable entity, in other words: a callsign or service that can be routed
 117 to, that lives at one or a few L</Node>s.
 118
 119 =head2 User
 120
 121 A L</User> is a connection to a L</Node> (that allows such connections)
 122 that does not occur in protocol. All L</User>s shall be identified with a name
 123 of up to 12 characters in the set [-0-9A-Z_]. All messages have to be routed via the
 124 L</Node> to which this L</User> is connected.
 125
 126 =head2 Endpoint
 127
 128 An L</Endpoint> is a connection to a L</Node> that uses the protocol. From a routing point of
 129 view, it is indistinguishable from a L</Node>. The L</Endpoint> is responsible for creating and decoding
 130 well formed protocol messages. An L</Endpoint> does not route beyond the immediate L</Node>(s) to
 131 which it is connected. It may also be a L</Service> connected to a L</Node> which provides some
 132 addressable service (such as a database) that can be queried.
 133
 134 =head2 Node
 135
 136 A L</Node> is connected to other L</Node>s. It is responsible for routing messages in protocol
 137 from other L</Node>s or L</Endpoint>s, whether directly connected or not. Optionally, a L</Node>
 138 may provide other interfaces, such as direct L</User> connections or legacy PC protocol speaking
 139 DX Clusters.
 140
 141 =head2 Channel
 142
 143 A L</Channel> is a L</Group> address that is not a L</Terminal>. It is (unless qualified by a L</Terminal>)
 144 broadcast on all of a L</Node>'s interfaces unless prevented by some filtering or other local policy on
 145 that L</Node>.
 146
 147 =head2 Service
 148
 149 A L</Service> is an application that either plugs into or connects as an L</Endpoint> to a L</Node>. It is an
 150 application that, in effect, is a database. In other words: queries are sent to the L</Service> and it sends
 151 back a reply.
 152
 153 =head1 Routing Section
 154
 155 The application that implements this protocol is essentially a line
 156 oriented message router. One line equals one message. Each line is
 157 effectively a datagram.
 158
 159 It is assumed that L</Node>s are connected to
 160 each other using a "reliable" streaming protocol such as TCP/IP or
 161 AX25. Having said that: in context, L</Messages> in this protocol could be
 162 multi/broadcast, either "as is" or wrapped in some other framing
 163 protocol.
 164
 165 Although the physical transport between L</Node>s is reliable, the actual message
 166 is unreliable, because this is an unreliable, best effort, "please route my packets
 167 through your node" protocol. There is no guarantee that a message
 168 will get to the other side of a mesh of L</Node>s. There may be a
 169 discontinuity either caused by outage or deliberate filtering.
 170
 171 However, as it is envisaged that most L</Messages> will be flood routed or,
 172 in the case of directed L</Messages> (those that have a L</Group> that is a callsign), sent down some/most/all interfaces showing a route for that
 173 direction, it is unlikely that L</Messages> will be lost in practice.
 174
 175 Assuming that there is a path between all the L</Node>s in a network, then it is guaranteed
 176 that a message will be delivered everywhere, eventually. It is possible (indeed likely) that
 177 copies of a message
 178 will arrive at L</Node>s more than once. L</Node>s are responsible for deduplicating those messages
 179 using the information in the L</Routing Section>.
 180
 181 =head2 Field Description
 182
 183 All fields in the L</Routing Section> are compulsory except the L</From> field. If it is missing
 184 so is the separating comma.
 185
 186 The L</Hop> field is incremented on receipt of a message on a node.
 187
 188 Fields are separated by the comma ',' character with the last field
 189 required followed by the vertical bar '|' character.
 190
 191 The characters allowed in the routing section are restricted. Any
 192 invalid characters in any field will cause the whole message to be
 193 silently dropped.
 194
 195 More detailed descriptions of the fields follow:
 196
 197 =over
 198
 199 =item B<Origin>
 200
 201 This is a compulsory field. It is the name of the originating node.
 202 The field can contain up to 12 characters in the set [-A-Z0-9_/] in
 203 any order. Higher layers may restrict this further.
 204
 205 The field must not be changed by any other node.
 206
 207 =item B<Group>
 208
 209 This is the Group (or Channel) to be used for this data. It is compulsory.
 210
 211 It is a string of up to 12 characters
 212 in the set [-A-Z0-9_/] in any order.
 213
 214 Optionally, for extra routing to
 215 a particular L</Terminal> connected at a specific L</Node>, or even a
 216 particular L</Terminal> in a L</Group>,
 217 it may have another 12 character
 218 string in the same set, concatenated with the first string. The two strings are separated by a ':'
 219 character. For example:
 220
 221   DX                        # the DX group
 222   GB7DJK                    # the node GB7DJK
 223   G1TLH                     # the user or endpoint G1TLH
 224   GB7DJK:G1TLH              # the user G1TLH at GB7DJK
 225   DX:G1TLH                  # the user G1TLH in the DX group
 226
 227 This field can contain either a L</Terminal> or some other string which is interpreted
 228 as a broadcastable group address. Any message that has a L</Group> that is not recognised as a L</Terminal> must
 229 be broadcast.
 230
 231 This means that messages to callsigns, for whom no specific routing information is available,
 232 will be found by means of a broadcast. Hopefully this will cause some kind of activity on behalf of
 233 that callsign that will allow routing tables to be gathered that narrow down the scope of any future
 234 message to that callsign through the network.
 235
 236 Remember that not all L</Node>s may pass every L</Group> field, depending on local policy.
 237
 238 =item B<TimeSeq>
 239
 240 This is a compulsory field. It is a 10 hexadecimal digit string which
 241 consists of a day no (1-31),
 242 a flag to indicate NTP synchronisation in use,
 243 seconds within that day (0-86399) [total of 6 hex digits]
 244 that are concatenated with a sequence number (0-65535)
 245 [4 hex digits] making the total of 10 hexadecimal digits.
 246
 247 The date portion is constructed as:
 248
 249   my $date = ((((gmtime)[3] << 1) | $ntpflag) << 18) | (time % 86400);
 250
 251 The sequence number is simply an unsigned short (or 16 bit) number
 252 starting at 0.
 253
 254 Each message originated at this node will increment the sequence
 255 number.
 256
 257 =item B<Hop>
 258
 259 This is a compulsory field. It is the number of hops from the
 260 originating node. It is incremented immediately on receipt and
 261 before determining its value.
 262
 263 So the originating node sends a message with a L</Hop> of 0; the
 264 neighbouring nodes must increment this field before passing
 265 it on to higher layers for onward processing.
 266
 267 Implementations may have an upper limit to this field and may
 268 silently drop incoming L</Messages> with a L</Hop> count greater than the
 269 limit.
 270
 271 =item B<From>
 272
 273 The L</From> field is optional. When present, it represents a L</Terminal> at
 274 the originating L</Node>. If it is missing then either it is not relevant or it
 275 is assumed to be the L</Origin>.
 276
 277 =back
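
As a sketch of the L</TimeSeq> packing described above, the following builds and unpacks the 10 digit string. The function names are invented for this example, and uppercase hex output is an assumption taken from the sample messages in this document. The layout used is 5 bits of day, 1 flag bit and 18 bits holding the seconds, followed by the 16 bit sequence number.

```perl
use strict;
use warnings;

# Build a TimeSeq from an epoch time, an NTP flag and a sequence number.
sub make_timeseq {
    my ($epoch, $ntpflag, $seq) = @_;
    my $day  = (gmtime($epoch))[3];                  # day of month, 1-31
    my $date = ((($day << 1) | ($ntpflag ? 1 : 0)) << 18) | ($epoch % 86400);
    return sprintf '%06X%04X', $date, $seq & 0xFFFF; # sequence wraps at 16 bits
}

# Reverse the packing above; returns nothing if the string is malformed.
sub parse_timeseq {
    my ($ts) = @_;
    return unless $ts =~ /\A[0-9A-Fa-f]{10}\z/;
    my $date = hex(substr($ts, 0, 6));
    my $seq  = hex(substr($ts, 6, 4));
    return {
        day     => $date >> 19,
        ntp     => ($date >> 18) & 1,
        seconds => $date & 0x3FFFF,
        seq     => $seq,
    };
}

my $t = parse_timeseq('3D02350001');
printf "day %d, ntp %d, seconds %d, seq %d\n", @{$t}{qw(day ntp seconds seq)};
# prints: day 7, ntp 1, seconds 66101, seq 1
```

Because only the day of the month is carried, a L</TimeSeq> value repeats from one month to the next.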
 278
 279 =head2 Routing
 280
 281 It is assumed that nodes will be connected in a looped network with
 282 more than one route available (in many cases) to another node.
 283
 284 In any case, most traffic is not directed, but broadcast to all users
 285 on all nodes.
 286
 287 Each message is uniquely identified by the (L</Origin>,L</TimeSeq>)
 288 tuple. The basic system will learn which interfaces can see what nodes
 289 by looking at the tuple and merging that with the L</Hop> count.
 290 Each interface remembers the latest L</TimeSeq> with the lowest L</Hop>
 291 for each L</Origin> that arrives on that interface. It also remembers
 292 the number of L</Messages> for that L</Origin> that have been received on
 293 that interface.
 294
 295 Any message for onward broadcast is duplicated and sent out on all
 296 interfaces that it did not come in on.
 297
 298 Any message that is directed to a particular node will be sent out on
 299 the "best" interface based on routing information gathered so far. If there
 300 is more than one possible route then, depending on network or local
 301 policy, the message may be duplicated and sent on other interfaces
 302 as well.
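
One possible shape for the per-interface bookkeeping described above is sketched here. Everything in it is an assumption made for illustration: the C<%seen> layout, the function names, the reading of "latest TimeSeq with the lowest Hop" as "the most recent message at the lowest hop count seen so far", and "best" meaning simply the fewest hops.

```perl
use strict;
use warnings;

# Assumed layout: $seen{$interface}{$origin} = { timeseq, hop, count }
my %seen;

# Record what arrived on one interface for one origin.
sub note_route {
    my ($iface, $origin, $timeseq, $hop) = @_;
    my $r = $seen{$iface}{$origin}
        ||= { timeseq => $timeseq, hop => $hop, count => 0 };
    $r->{count}++;                      # messages seen for this origin here
    if ($hop <= $r->{hop}) {            # keep the lowest hop count ...
        @{$r}{qw(timeseq hop)} = ($timeseq, $hop);  # ... and its latest TimeSeq
    }
    return $r;
}

# Interface with the fewest hops to this origin, or undef when nothing
# has been learned yet (the caller would then flood the message).
sub best_interface {
    my ($origin) = @_;
    my ($best) =
        sort { $seen{$a}{$origin}{hop} <=> $seen{$b}{$origin}{hop} || $a cmp $b }
        grep { exists $seen{$_}{$origin} } keys %seen;
    return $best;
}

# Onward broadcast: every interface except the one the message came in on.
sub flood_targets {
    my ($in, @all) = @_;
    return grep { $_ ne $in } @all;
}

note_route('if1', 'GB7BAA', '3D02355421', 3);
note_route('if2', 'GB7BAA', '3D02355421', 1);
print "route to GB7BAA via ", best_interface('GB7BAA'), "\n";
```

A real implementation would also need to age these entries, since L</Node>s come and go without notice.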
 303
 304 =head2 DeDuplication
 305
 306 On receipt of a message, its unique tuple (L</Origin>,L</TimeSeq>) is
 307 checked against a hash table. If it exists: the message is silently
 308 dropped. If it does not exist in the hash table then the tuple is
 309 added.
 310
 311 The hash table is periodically cleaned, removing tuples that
 312 have expired. The length of time a tuple remains in the hash table
 313 is implementation dependent but could easily be several days, if
 314 required.
 315
 316 This mechanism only ensures that a message broadcast around the network
 317 travels the least distance and through the fewest nodes possible. It
 318 is up to higher layers to make sure that data carried is not, itself,
 319 duplicated!
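
A minimal sketch of the hash table and its periodic clean follows. The one day lifetime and the function names are assumptions chosen for the example; the text above leaves the lifetime to the implementation. The time is passed in explicitly so that the behaviour is easy to follow.

```perl
use strict;
use warnings;

my %dup;                 # "$origin,$timeseq" => time first seen
my $lifetime = 86400;    # assumed expiry in seconds

# True if this (Origin,TimeSeq) tuple was seen before; otherwise record it.
sub is_duplicate {
    my ($origin, $timeseq, $now) = @_;
    $now = time() unless defined $now;
    my $key = "$origin,$timeseq";    # ',' cannot occur inside either field
    return 1 if exists $dup{$key};
    $dup{$key} = $now;
    return 0;
}

# Periodic clean: forget tuples older than $lifetime.
# Returns the number of tuples still held.
sub clean_dups {
    my ($now) = @_;
    $now = time() unless defined $now;
    delete @dup{ grep { $now - $dup{$_} > $lifetime } keys %dup };
    return scalar keys %dup;
}

print is_duplicate('GB7TLH', '3D02350001', 1000) ? "drop\n" : "process\n";
print is_duplicate('GB7TLH', '3D02350001', 1001) ? "drop\n" : "process\n";
```

The first call prints C<process> and the second prints C<drop>, which is the silent drop described above.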
 320
 321 =head2 Examples
 322
 323  # on link startup from GB7BAA (both sides hello)
 324  GB7TLH,ROUTE,3D02350001,0|HELLO,Aranea,1.2,24.123
 325  GB7BAA,ROUTE,3D02355421,1|HELLO,Aranea,1.1,23.245
 326
 327  # on user startup to GB7TLH
 328  GB7TLH,ROUTE,3D042506F2,0,G1TLH|HELLO,PClient,1.3
 329
 330  # on user disconnection
 331  GB7TLH,ROUTE,3D9534F32D,0,G1TLH|BYE
 332
 333  # a talk (actually 'text') message to a user (some distance away
 334  # from the origin node)
 335  GB7TLH,G8TIC,3D03450019,3,G1TLH|T,Hiya Mike what's happening?
 336
 337  # a talk/chat/text message to a Group
 338  GB7TLH,VHF,0413525F23,2,G1TLH|T,2m is opening on MS
 339
 340  # a ping to find the whereabouts and distance of a user from a node
 341  # the hex number on the end is the ping ID
 342  GB7TLH,G7BRN,1512346543,0,G1TLH|PING,9F4D
 343
 344  # this effectively asks whether the user is on-line on a particular node
 345  GB7TLH,GB7BAA:G7BRN,1512346543,0,G1TLH|PING,35DE
 346
 347  # A possible reply, same ID as ping followed by the no of hops on the
 348  # ping that was received thus telling you how far away it is.
 349  GB7BAA,G1TLH,1512450534,3,G7BRN|PONG,35DE,3
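
To show how the routing sections in these examples might be checked on receipt, here is a hedged sketch. The name C<parse_routing> is invented; a decimal L</Hop>, a L</From> restricted to the same character set as L</Origin>, and defaulting a missing L</From> to the L</Origin> are all assumptions drawn from the examples rather than rules stated in this document.

```perl
use strict;
use warnings;

my $name = qr/[-A-Z0-9_\/]{1,12}/;    # Origin, each Group part, and From

# Parse a routing section into a hash, incrementing Hop on receipt.
# Returns nothing for a malformed section, which the caller silently drops.
sub parse_routing {
    my ($routing) = @_;
    my ($origin, $group, $timeseq, $hop, $from, $extra) = split /,/, $routing, -1;
    return if defined $extra;         # too many fields
    return unless defined $hop;       # too few fields
    return unless $origin  =~ /\A$name\z/;
    return unless $group   =~ /\A$name(?:\:$name)?\z/;
    return unless $timeseq =~ /\A[0-9A-F]{10}\z/;
    return unless $hop     =~ /\A[0-9]+\z/;
    return if defined $from && $from !~ /\A$name\z/;
    return {
        origin  => $origin,
        group   => $group,
        timeseq => $timeseq,
        hop     => $hop + 1,                          # incremented on receipt
        from    => defined $from ? $from : $origin,   # assumed default
    };
}

my $r = parse_routing('GB7TLH,GB7BAA:G7BRN,1512346543,0,G1TLH');
print "to $r->{group} from $r->{from}\@$r->{origin}, hop now $r->{hop}\n";
```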
 350
 351
 352 =head1 Command Section
 353
 354 The L</Command Section> of the message contains the actual data being
 355 passed. It is called the Command Section because all commands
 356 are identified with a L</Tag> each of which is implemented by
 357 the software using this protocol. Each L</Tag> (usually) is followed by one
 358 or more L</Fields>.
 359
 360 =head2 Tag
 361
 362 The L</Tag> consists of a string of uppercase letters and digits, starting
 363 with a leading, uppercase, letter. Tags should be as short as is meaningful.
 364
 365 Valid tags would be:
 366
 367  DX
 368  PC23
 369  ANN
 370
 371 Invalid tags include:
 372
 373  1AAA
 374  dx
 375  Ann
 376
 377 The L</Tag> is separated from its data L</Fields> by a comma ','.
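
The tag rule above reduces to a single pattern. A small sketch, with C<is_valid_tag> as an invented name, run against the valid and invalid examples listed above:

```perl
use strict;
use warnings;

# A tag is an uppercase letter followed by uppercase letters or digits.
sub is_valid_tag {
    my ($tag) = @_;
    return $tag =~ /\A[A-Z][A-Z0-9]*\z/ ? 1 : 0;
}

for my $tag (qw(DX PC23 ANN 1AAA dx Ann)) {
    printf "%-5s %s\n", $tag, is_valid_tag($tag) ? 'valid' : 'invalid';
}
```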
 378
 379 =head2 Fields
 380
 381 All fields
 382 in any subsequent data shall be separated by a comma ','.
 383 All fields shall
 384 be HTTP encoded such that reserved characters (comma ',',
 385 vertical bar '|',
 386 percent '%',
 387 equals '='
 388 and non printable characters less than 127 (or %7F in hex)
 389 [including newline and carriage return]) are translated to
 390 their two hex digit equivalent preceded by the percent '%' character.
 391
 392 For example:
 393
 394  "%0D%0A" is "<carriage return><linefeed>".
 395  "hello%2C there" is "hello, there"
 396
 397 This is not standard CSV: fields are not quoted (delimited with either
 398 ' or ").
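
The escaping rule can be sketched as a pair of functions. The names are invented for this example. Treating DEL (0x7F) as one of the non printable characters is an assumption, as is leaving bytes above 127 as raw UTF8 rather than percent escaping them.

```perl
use strict;
use warnings;

# Escape one field: reserved and control characters become %XX.
sub encode_field {
    my ($text) = @_;
    my $bytes = $text;
    utf8::encode($bytes);    # characters above 127 become UTF8 bytes
    $bytes =~ s/([,|%=\x00-\x1F\x7F])/sprintf '%%%02X', ord $1/ge;
    return $bytes;
}

# Reverse the escaping and turn the UTF8 bytes back into characters.
sub decode_field {
    my ($field) = @_;
    my $bytes = $field;
    $bytes =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    utf8::decode($bytes);
    return $bytes;
}

print encode_field("hello, there"), "\n";    # prints: hello%2C there
print encode_field("a=b|c"), "\n";           # prints: a%3Db%7Cc
```

Note that the '%' character itself is escaped first and foremost, so decoding an encoded field always returns the original text.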
 399
 400 All national characters above 127 are UTF8 encoded in the
 401 standard perl 5.8.x way. It follows that all (perl) programs that
 402 are written according to this specification must say:
 403
 404  use utf8;
 405
 406 A message (or line) is terminated with <carriage return><linefeed>
 407 0x0d 0x0a. Incoming L</Messages> must be accepted even when terminated
 408 with just <linefeed>.
 409
 410 Care must be taken to make sure that fields have any reserved characters
 411 encoded. In particular: it is perfectly permissible to have <linefeed>
 412 characters in a field - so long as they are escaped.
 413
 414 Fields come in two styles: either simple fields (just containing
 415 data) or B<key>=B<value> pairs. Each pair must be separated from
 416 the next by a comma ','. The B<key> must consist of the set of
 417 characters [a-z0-9_] (ie lowercase letters, digits and underscore),
 418 with a leading letter. The B<value> must be HTTP encoded as
 419 specified above and can otherwise contain any character.
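
Since '=' and ',' are always escaped inside data, the two field styles can be told apart mechanically. The following is a sketch only, with invented names, that sorts the fields following a L</Tag> into simple fields and B<key>=B<value> pairs. It percent decodes each value but, to stay short, does not go on to decode UTF8.

```perl
use strict;
use warnings;

# Undo %XX escaping in one field.
sub unescape {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $s;
}

# Split the data after the Tag into simple fields and key=value pairs.
# Returns (\@simple, \%pair).
sub parse_fields {
    my ($data) = @_;
    my (@simple, %pair);
    for my $field (split /,/, $data, -1) {
        if ($field =~ /\A([a-z][a-z0-9_]*)=(.*)\z/s) {
            my ($key, $value) = ($1, $2);
            $pair{$key} = unescape($value);
        } else {
            push @simple, unescape($field);
        }
    }
    return (\@simple, \%pair);
}

my ($simple, $pair) = parse_fields('hello%2C there,name=G1TLH,note=a%3Db');
print "simple: $_\n" for @$simple;
print "$_ = $pair->{$_}\n" for sort keys %$pair;
```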
 420
 421 There is no maximum size specified for a message. It is up to each
 422 implementation to enforce one (if only for their own protection).
 423
 424 =head2 Standard Commands
 425
 426 There are a number of L</Standard Commands> which must be accepted by
 427 all implementations.
 428
 429 =over
 430
 431 =item B<HELLO>
 432
 433  HELLO,<software name>,<version>,<build>,<comments>
 434
 435 Command sent on connection to another node. Both sides send their information
 436 to the other. All the possible arguments are optional, although some of the
 437 arguments should be sent in order to help diagnose problems. This command is
 438 broadcast.
 439
 440 =item B<BYE>
 441
 442  BYE,<comments>
 443
 444 Command sent to all connections when the software is shutting down. This is sent
 445 by the node just before shutdown occurs. This is really only used to help the
 446 network prune its routing tables. It isn't a requirement. The <comments> field
 447 is optional.
 448
 449 =item B<DISC>
 450
 451  DISC,<node name>,<comments>
 452
 453 Command sent when a node has disconnected from this node. This message is sent when
 454 an interface shuts down. It need not be sent if a L</BYE> from an interface for
 455 that node has just been received. This command should be broadcast.
 456
 457 The <node name> is mandatory and is the name of the interface that has just
 458 disconnected.
 459
 460 =item B<PING>
 461
 462  PING,<user>,<ping id>
 463
 464 Command to send a ping to a node or user. This command is used both by the software
 465 and users to determine a) whether a node or user exists and b) how good the path is
 466 between them.
 467
 468 The <ping id> is a unique string which is usually the hexadecimal equivalent of an
 469 integer that is incremented every time it is used. But it can be anything that
 470 will identify this ping using the tuple (L</Origin>,<ping id>) as unique.
 471
 472 =item B<PONG>
 473
 474  PONG,<ping id>,<user>,<no of hops on ping>
 475
 476 Command to reply to a ping. This is sent as a reply to an incoming ping command.
 477 The <ping id> is the one supplied and the <no of hops on ping> is the number of
 478 hops it took for the ping to arrive.
 479
 480 =item B<T>
 481
 482  T,<text>
 483
 484 All implementations must be able to send "text" (encoded as specified in
 485 L</Fields>). There would be little point in doing all this otherwise!
 486
 487 =back
 488
 489 =head1 AUTHOR
 490
 491 Dirk Koopman, G1TLH, E<lt>djk@tobit.co.ukE<gt>
 492
 493 =head1 COPYRIGHT AND LICENSE
 494
 495 Copyright 2004-2005 by Dirk Koopman, G1TLH
 496
 497 This library is free software; you can redistribute it and/or modify
 498 it under the same terms as Perl itself.
 499
 500 $Revision$
 501
 502 =cut
 503
 504