[Ndn-interest] any comments on naming convention?

Thompson, Jeff jefft0 at remap.ucla.edu
Mon Sep 15 12:12:05 PDT 2014


Hi Marc. You say "if the T comes before the L, then the short-lex ordering does not work" meaning that the ordering will not depend on the length of the name component "value" but on the type.

It seems Junxiao worried about this too when he said "It's unnecessary to compare marker code and BYTE* individually, because most applications won't have different markers under the same prefix."

Is there a use case where it matters that short-lex ordering is thrown off when comparing two name components with different types?  Is it safe to assume that an application will always be doing short-lex comparison of two name components of the same type (for example, leftmost child of two version components)?

- Jeff T


From: "Marc.Mosko at parc.com" <Marc.Mosko at parc.com>
Date: Monday, September 15, 2014 11:50 AM
To: Jeff Thompson <jefft0 at remap.ucla.edu>
Cc: "shijunxiao at email.arizona.edu" <shijunxiao at email.arizona.edu>, "ndn-interest at lists.cs.ucla.edu" <ndn-interest at lists.cs.ucla.edu>
Subject: Re: [Ndn-interest] any comments on naming convention?

On Sep 15, 2014, at 11:33 AM, Thompson, Jeff <jefft0 at remap.ucla.edu> wrote:

Hi Marc,

Thanks for the clear summary.  You say "it became clear that it is difficult to have a “strcmp()” style comparison over the raw TLV bytes with a variable-length T and L encoding."  Can you say more about why variable-length encoding makes strcmp difficult?

At the time, we were having discussions about whether a 2-byte “0” is different from a 1-byte “0”, for example.  If they have the same meaning, but one is just incorrectly encoded in 2 bytes, then do we have to validate each T and throw away the ones that are mis-encoded?
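
A minimal sketch of the mis-encoding question. The thread does not say which variable-length scheme was under discussion, so this assumes the NDN-TLV VAR-NUMBER rule (first octet below 253 is the value itself; 253, 254 and 255 announce a 2-, 4- or 8-octet value). Under that rule the longer form of zero takes three octets rather than two. The function name and return shape are illustrative only.

```python
# Assumption: NDN-TLV VAR-NUMBER rules, used here only as an example of a
# variable-length encoding. decode_var_number is a hypothetical helper.

def decode_var_number(buf: bytes, offset: int = 0):
    """Return (value, octets_consumed, is_minimal) for one VAR-NUMBER."""
    first = buf[offset]
    if first < 253:
        return first, 1, True
    width = {253: 2, 254: 4, 255: 8}[first]
    if len(buf) < offset + 1 + width:
        raise ValueError("truncated VAR-NUMBER")
    value = int.from_bytes(buf[offset + 1 : offset + 1 + width], "big")
    # The encoding is minimal only if the value would not fit a shorter form.
    smallest_allowed = {2: 253, 4: 0x10000, 8: 0x100000000}[width]
    return value, 1 + width, value >= smallest_allowed

one_octet_zero = bytes([0x00])
long_zero = bytes([0xFD, 0x00, 0x00])  # same number, longer encoding

print(decode_var_number(one_octet_zero))
print(decode_var_number(long_zero))
```

Both inputs decode to the same number, yet the raw octet strings differ, so a byte-wise comparison treats them as different types unless the decoder rejects the non-minimal form.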

Also, if the T comes before the L, then the short-lex ordering does not work.  Short-lex says that name component A is less than B if the length of A is less than the length of B, or if |A| = |B| and A sorts before B.  If the T comes before the L, then you cannot simply do a strcmp() because the variable-length T’s will throw things off.  All you can say is that within a T value, you use short-lex.
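
A small sketch of the point about T coming before L. It uses a toy encoding with a 1-octet T and a 1-octet L (an assumption made only to keep the example short; the type numbers 8 and 9 are arbitrary) and compares raw octets the way strcmp()/memcmp() would.

```python
def shortlex_less(a: bytes, b: bytes) -> bool:
    """Short-lex: the shorter value sorts first; equal lengths compare bytewise."""
    return (len(a), a) < (len(b), b)

def tlv(t: int, value: bytes) -> bytes:
    """Toy T-L-V with 1-octet T and 1-octet L (illustrative assumption)."""
    assert t < 256 and len(value) < 256
    return bytes([t, len(value)]) + value

# Same type: raw-octet order matches short-lex order on the values.
same_type_agrees = (tlv(8, b"z") < tlv(8, b"aa")) == shortlex_less(b"z", b"aa")

# Different types: the T octet decides before L is ever examined.
raw_says_less = tlv(8, b"zz") < tlv(9, b"a")
shortlex_says_less = shortlex_less(b"zz", b"a")

print(same_type_agrees, raw_says_less, shortlex_says_less)
```

With equal types the two orders agree; with different types the raw comparison ranks the longer value first, which is the sense in which short-lex holds only within a single T.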

Marc

- Jeff T

From: "Marc.Mosko at parc.com" <Marc.Mosko at parc.com>
Date: Monday, September 15, 2014 11:20 AM
To: "shijunxiao at email.arizona.edu" <shijunxiao at email.arizona.edu>
Cc: "ndn-interest at lists.cs.ucla.edu" <ndn-interest at lists.cs.ucla.edu>
Subject: Re: [Ndn-interest] any comments on naming convention?

This is an interesting discussion.  At PARC, when we went away from ccnb to TLV-based name components, we agreed with the Cisco position that different types of name components should have different TLV types.

Anything that used to be a command marker was moved to a TLV type and we no longer use command markers.  We see having TLV types in the name as redundant with command markers, so long as there is a type space for applications to use to generate their own application-dependent types.

We use one general (binary) name component, one for versions, one for segments (chunks), one for nonces (in the name, not an Interest nonce), one for keys.  In our re-implementation of the 0.x repo protocol, those repo command-markers became their own application-dependent name TLV types.  In our sync protocol, we use other application-dependent TLV types instead of command markers.

Our ordering is defined as the lexicographic compare of each TLV, including the T and L.  Because we use fixed-length T and L fields, this ordering is always well-defined.  About a year ago, when we were considering different variable-length TLV schemes, it became clear that it is difficult to have a “strcmp()” style comparison over the raw TLV bytes with a variable-length T and L encoding.  There are some T and L encoding schemes that still allow comparison over the raw bytes, but they have their own drawbacks.
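
A sketch of why fixed-width fields keep the raw comparison well-defined. The field widths (2 octets each, big-endian) and the type numbers are assumptions for illustration; the message only says the fields are fixed.

```python
import struct

def encode_fixed(t: int, value: bytes) -> bytes:
    """Fixed-width T and L (2 octets each, network byte order), then the value."""
    return struct.pack("!HH", t, len(value)) + value

# Hypothetical (type, value) pairs.
components = [(2, b"b"), (1, b"zz"), (1, b"a"), (2, b""), (1, b"b")]

# Ordering by raw octets, as memcmp() would see them...
by_raw_octets = sorted(components, key=lambda c: encode_fixed(*c))
# ...matches ordering by type, then length, then value.
by_fields = sorted(components, key=lambda c: (c[0], len(c[1]), c[1]))

print(by_raw_octets == by_fields)
```

Because T and L always occupy the same octet positions, a byte-wise compare reaches the type first, then the length, then the value, so it reduces to short-lex within each type.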

One solution is to declare many new TLV types: VersionComponent, SegmentComponent, TimestampComponent, etc.
This can guarantee unambiguity, but this restricts the introduction of new conventions, because when we want to introduce another convention in the future, old consumer applications would not understand the new TLV type.

I would disagree with this statement.  Anytime you introduce a new command-marker, old applications will not understand it.  If the new command-marker is required for application execution, then all applications must be updated.  If the new command-marker (or tlv type) is not required, then the old application should continue just fine treating the type as opaque.
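
One way to picture "treating the type as opaque", as a rough sketch: an older application keeps a table of the types it knows and carries every other component along unchanged. The type numbers and the helper name are hypothetical.

```python
# Hypothetical type numbers known to an older application.
KNOWN_TYPES = {1: "name", 2: "version", 3: "segment"}

def label_components(components):
    """Label known types; anything else stays in place, marked opaque."""
    return [(KNOWN_TYPES.get(t, "opaque"), t, value) for t, value in components]

# Type 40 stands for a convention introduced after the application shipped.
name = [(1, b"video"), (2, b"\x05"), (40, b"\x01\x02")]
print(label_components(name))
```

The unknown component keeps its position and octets, so the name can still be matched, ordered and forwarded; only code that needs the new semantics has to change.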

Marc Mosko


On Sep 15, 2014, at 10:49 AM, Junxiao Shi <shijunxiao at email.arizona.edu> wrote:

Dear folks

I agree with @MarkStapp that Naming Conventions rev1 does not guarantee version/segment components to be unambiguous.
One alternate proposal was to use an additional NameComponent before the number as a marker, such as "_v/<version>" "_s/<segment>". This alternate proposal is also unable to make version/segment components unambiguous, and it doesn't work well with ChildSelector.

One easy solution to this problem is: restrict the octets to be used in regular names.
In rev1, we could require regular NameComponent to start with a valid UTF8 character.
In alternate proposal, we could forbid regular NameComponent to start with "_".
However, this solution is undesirable, because some applications do need to operate with binary components (eg. SignatureBits component in signed Interest).

NDN-TLV 0.2.0 (unapproved spec) introduces NumberComponent <http://gerrit.named-data.net/gitweb?p=NDN-TLV.git;a=blob;f=name.rst;h=922bb1cdd568e90bb9e36e8b9339d6c819d2cf06;hb=1ac334640da059791ad5c75637eee075fd0b87b3#l11> to indicate a component is a number.
This is insufficient because it doesn't say the meaning of a number: is it a version number or a segment number?

One solution is to declare many new TLV types: VersionComponent, SegmentComponent, TimestampComponent, etc.
This can guarantee unambiguity, but this restricts the introduction of new conventions, because when we want to introduce another convention in the future, old consumer applications would not understand the new TLV type.


If I were to redesign the convention, I would introduce a MarkedComponent TLV type.
The MarkedComponent TLV can appear in place of NameComponent.
The value part of a MarkedComponent contains a VAR-NUMBER which is a marker code, followed by zero or more arbitrary octets.

Name ::= NAME-TYPE TLV-LENGTH (NameComponent | MarkedComponent)*
FinalBlockId ::= FINAL-BLOCK-ID-TYPE TLV-LENGTH (NameComponent | MarkedComponent)
MarkedComponent ::= MARKED-COMPONENT-TYPE TLV-LENGTH VAR-NUMBER BYTE*
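
A minimal encoding sketch for the MarkedComponent rule above, assuming NDN-TLV VAR-NUMBER encoding for TLV-TYPE, TLV-LENGTH and the marker code. The proposal assigns no numbers, so MARKED_COMPONENT_TYPE and the marker code below are placeholders.

```python
def encode_var_number(n: int) -> bytes:
    """NDN-TLV VAR-NUMBER: 1, 3, 5 or 9 octets depending on magnitude."""
    if n < 0:
        raise ValueError("VAR-NUMBER must be non-negative")
    if n < 253:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "big")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "big")
    return b"\xff" + n.to_bytes(8, "big")

MARKED_COMPONENT_TYPE = 9  # placeholder; not assigned by the proposal
MARKER_VERSION = 1         # placeholder marker code

def encode_marked_component(marker: int, payload: bytes = b"") -> bytes:
    """MARKED-COMPONENT-TYPE TLV-LENGTH VAR-NUMBER BYTE*"""
    value = encode_var_number(marker) + payload
    return (encode_var_number(MARKED_COMPONENT_TYPE)
            + encode_var_number(len(value))
            + value)

print(encode_marked_component(MARKER_VERSION, b"\x05").hex())
```

The marker code is part of the TLV-VALUE, so a parser that does not know a given marker still sees a well-formed component of a known TLV type.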

The canonical order is defined as:

  *   A MarkedComponent is less than any NameComponent.
  *   Two MarkedComponents are compared by their length and value, in the same way as NameComponent.
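
The two ordering rules above can be sketched as a sort key. The tuple representation (kind, value) is an assumption made for the example; the value of a marked component is taken to be the whole TLV-VALUE, that is, the marker code followed by its octets.

```python
def canonical_key(component):
    """Sort key: MarkedComponent before NameComponent, then length, then value."""
    kind, value = component  # kind is "marked" or "name" (illustrative labels)
    rank = 0 if kind == "marked" else 1
    # The marker code and trailing octets are compared together as one value.
    return (rank, len(value), value)

components = [
    ("name", b"a"),
    ("marked", b"\x01\x02\x03"),  # marker 1 with two octets of payload
    ("marked", b"\x02\x01"),      # marker 2 with one octet of payload
    ("name", b""),
]
ordered = sorted(components, key=canonical_key)
print(ordered)
```

In this example the component with marker 2 sorts ahead of the one with marker 1 because its value is shorter, so the order is meaningful per marker only when a prefix holds components with a single marker code.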

The benefits of this solution are:

  *   Version/segment/etc components are distinguished from regular NameComponent, because they have a distinct TLV-TYPE: MarkedComponent.
  *   Adding a new convention only needs allocation of a marker code. No new TLV type is introduced, so old consumers can continue to work.
  *   Encoding marker code as VAR-NUMBER allows much larger marker space than restricting to one-octet marker.
  *   Canonical order evaluation is efficient. It's unnecessary to compare marker code and BYTE* individually, because most applications won't have different markers under the same prefix.


Yours, Junxiao

