[Ndn-interest] any comments on naming convention?

Ignacio.Solis at parc.com Ignacio.Solis at parc.com
Thu Sep 18 15:37:01 PDT 2014


IMO it’s not good to equate URLs to NDN/CCN network names. In the case of CCN, we carry data in the interest payload field, so for us names smaller than 96 bytes are possible and probable. Longer names may exist for some manifests, but we don’t have a clear notion of the frequency (or ratio) of those yet.

In terms of URL sizes, there is quite a bit of info on that from some of the Cisco work and the datasets out there.   (See http://www.ietf.org/proceedings/89/slides/slides-89-icnrg-10.pdf )

Note that the size of the complete name may not be an issue in some forwarding situations because you’re looking for routing prefixes.   Finally, if the size of the complete name matters, because you’re trying to hash/match the whole thing, then the index is not going to help you.

Nacho


--
Nacho (Ignacio) Solis
Protocol Architect
Principal Scientist
Palo Alto Research Center (PARC)
+1(650)812-4458
Ignacio.Solis at parc.com

On 9/18/14, 3:27 PM, "Felix Rabe" <felix at rabe.io> wrote:

Would that chip be suitable, i.e. can we expect most names to fit in (the magnitude of) 96 bytes? What length are names usually in current NDN experiments?

I guess wide deployment could make for even longer names. Related: Many URLs I encounter nowadays easily don't fit within two 80-column text lines, and NDN will have to carry more information than URLs, as far as I see.

On 18/Sep/14 23:15, Marc.Mosko at parc.com wrote:

In fact, the index in a separate TLV will be slower on some architectures, like the ezChip NP4. The NP4 can hold the first 96 frame bytes in memory; any subsequent memory is accessed only as two adjacent 32-byte blocks (there can be at most 5 blocks available at any one time). If you need to switch between arrays, it would be very expensive. If you have to read past the name to get to the 2nd array, then read it, then back up to get to the name, it will be pretty expensive too.

Marc

On Sep 18, 2014, at 2:02 PM, <Ignacio.Solis at parc.com> wrote:



Does this make that much difference?

Suppose you want to parse the first 5 components. There are two ways to do it:

Read the index, find entry 5, then read in that many bytes from the start
offset of the beginning of the name.
OR
Start reading the name, then (find size + move) 5 times.
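
A minimal sketch of the two options, to make the comparison concrete. The
wire layout here is an assumption made for illustration (1 byte of type,
1 byte of length, then the value); it is not the NDN or CCN format, and
all function names are hypothetical.

```python
# Toy encoding (an assumption, not the NDN or CCN wire format):
# each name component is 1 byte of type, 1 byte of length, then the value.
GENERIC = 0x08  # arbitrary type code chosen for this sketch


def encode_name(components):
    """Concatenate components as type-length-value triples."""
    out = bytearray()
    for value in components:
        out += bytes([GENERIC, len(value)]) + value
    return bytes(out)


def build_index(name):
    """Return one byte per component: the offset where that component ends."""
    ends, pos = [], 0
    while pos < len(name):
        pos += 2 + name[pos + 1]  # skip type, length, and value
        ends.append(pos)
    return bytes(ends)


def prefix_via_index(name, index, n):
    """Option 1: read entry n of the index, then slice the name."""
    return name[:index[n - 1]]


def prefix_via_walk(name, n):
    """Option 2: (find size + move) n times, no index needed."""
    pos = 0
    for _ in range(n):
        pos += 2 + name[pos + 1]
    return name[:pos]


name = encode_name([b"parc", b"videos", b"demo", b"v3", b"seg", b"0", b"extra"])
index = build_index(name)
assert prefix_via_index(name, index, 5) == prefix_via_walk(name, 5)
```

Both functions return the same bytes; the only difference is which memory
they have to read to find the cut point.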

How much speed are you getting from one to the other?  You seem to imply
that the first one is faster.  I don't think this is the case.

In the first one you'll probably have to get the cache line for the index,
then all the required cache lines for the first 5 components.  For the
second, you'll have to get all the cache lines for the first 5 components.
Given an assumption that a cache miss is way more expensive than
evaluating a number and computing an addition, you might find that the
performance of the index is actually slower than the performance of the
direct access.

Granted, there is a case where you don't access the name at all, for
example, if you just get the offsets and then send the offsets as
parameters to another processor/GPU/NPU/etc.  In this case you may see a
gain IF there are more cache line misses in reading the name than in
reading the index.   So, if the regular part of the name that you're
parsing is bigger than the cache line (64 bytes?) and the name is to be
processed by a different processor, then you might see some performance
gain in using the index, but in all other circumstances I bet this is not
the case.   I may be wrong; I haven't actually tested it.
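
The cache-line argument can be checked with a little arithmetic. The sketch
below only counts distinct cache lines touched; it is not a benchmark, the
64-byte line size is the same guess as in the text, and the buffer
addresses are made-up inputs.

```python
CACHE_LINE = 64  # bytes; the same "(64 bytes?)" guess as in the text


def lines_touched(start, length, line=CACHE_LINE):
    """Set of cache-line numbers covered by bytes [start, start + length)."""
    if length <= 0:
        return set()
    return set(range(start // line, (start + length - 1) // line + 1))


def compare(name_start, prefix_len, index_start, index_len):
    """Distinct cache lines read by each strategy on a single processor."""
    walk = lines_touched(name_start, prefix_len)
    indexed = walk | lines_touched(index_start, index_len)
    return len(walk), len(indexed)


# A 40-byte prefix at address 0, with a 10-byte index stored at address 200.
print(compare(name_start=0, prefix_len=40, index_start=200, index_len=10))
# prints (1, 2): the index costs one extra line when the name is read anyway

# Offload case: this processor reads only the index, never the name.
print(len(lines_touched(200, 10)), len(lines_touched(0, 200)))
# prints 1 4: the index wins only if the name spans more lines than it does
```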

This is all to say, I don't think we should be designing the protocol with
only one architecture in mind (namely, the architecture that sends the name
to a different processor than the index).

If you have numbers that show that the index is faster I would like to see
under what conditions and architectural assumptions.

Nacho

(I may have misinterpreted your description so feel free to correct me if
I'm wrong.)







On 9/18/14, 12:54 AM, "Massimo Gallo" <massimo.gallo at alcatel-lucent.com>
wrote:



Indeed, each component's offset must be encoded using a fixed number of
bytes:

i.e.,
Type = Offsets
Length = 10 Bytes
Value = Offset1(1byte), Offset2(1byte), ...

You could also imagine an "Offset_2byte" type if your name is too
long.
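
A rough sketch of such a fixed-width offsets TLV. The type codes, the
1-byte Length field, and the function names are assumptions made for
illustration; nothing here is taken from an NDN or CCN specification.

```python
import struct

OFFSETS_1BYTE = 0x80  # hypothetical type codes, not assigned by any spec
OFFSETS_2BYTE = 0x81


def encode_offsets(offsets):
    """Build an offsets TLV, widening entries to 2 bytes only when needed.

    Assumes the value fits in a 1-byte Length field (under 256 bytes).
    """
    if max(offsets, default=0) < 256:
        tlv_type, value = OFFSETS_1BYTE, bytes(offsets)
    else:
        tlv_type = OFFSETS_2BYTE
        value = struct.pack(f">{len(offsets)}H", *offsets)
    return bytes([tlv_type, len(value)]) + value


def offset_at(tlv, i):
    """Return entry i (0-based) without reading the entries before it."""
    width = 1 if tlv[0] == OFFSETS_1BYTE else 2
    start = 2 + i * width  # skip the Type and Length bytes
    return int.from_bytes(tlv[start:start + width], "big")


short = encode_offsets([4, 10, 14, 16, 19, 20, 25, 30, 41, 50])
assert short[1] == 10            # Length = 10 bytes, one per offset
assert offset_at(short, 4) == 19

long_name = encode_offsets([100, 300, 700])
assert long_name[0] == OFFSETS_2BYTE
assert offset_at(long_name, 2) == 700
```

Because every entry has the same width, entry i sits at a position that can
be computed directly, which is what makes the direct access possible.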

Max

On 18/09/2014 09:27, Tai-Lin Chu wrote:


if you do not need the entire hierarchical structure (suppose you only
want the first x components) you can directly have it using the
offsets. With the Nested TLV structure you have to iteratively parse
the first x-1 components. With the offset structure you can directly
access the first x components.


I don't get it. What you described only works if the "offset" is
encoded in a fixed number of bytes. With varNum, you will still need to
parse x-1 offsets to get to the x-th offset.
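
To illustrate the point, here is a small sketch of reading a list of
variable-size numbers. It follows the NDN TLV variable-size number layout
(one byte for values below 253, otherwise a marker byte of 253, 254, or 255
followed by 2, 4, or 8 bytes); the helper names and the sample data are
hypothetical.

```python
def read_varnum(buf, pos):
    """Decode one variable-size number; return (value, next position)."""
    first = buf[pos]
    if first < 253:
        return first, pos + 1
    width = {253: 2, 254: 4, 255: 8}[first]
    value = int.from_bytes(buf[pos + 1:pos + 1 + width], "big")
    return value, pos + 1 + width


def nth_varnum(buf, n):
    """Entry n (0-based) is only reachable by decoding entries 0..n-1 first."""
    pos = 0
    for _ in range(n):
        _, pos = read_varnum(buf, pos)  # value unused, but the width is needed
    return read_varnum(buf, pos)[0]


packed = bytes([4, 10]) + b"\xfd\x01\x2c" + bytes([20])  # 4, 10, 300, 20
assert nth_varnum(packed, 3) == 20
```

The position of the fourth entry depends on the three-byte encoding of 300
before it, so there is no way to jump to it without scanning.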



On Wed, Sep 17, 2014 at 11:57 PM, Massimo Gallo
<massimo.gallo at alcatel-lucent.com> wrote:


On 17/09/2014 14:56, Mark Stapp wrote:


ah, thanks - that's helpful. I thought you were saying "I like the
existing NDN UTF8 'convention'." I'm still not sure I understand what you
_do_ prefer, though. it sounds like you're describing an entirely different
scheme where the info that describes the name-components is ... someplace
other than _in_ the name-components. is that correct? when you say "field
separator", what do you mean (since that's not a "TL" from a TLV)?


Correct.
In particular, with our name encoding, a TLV indicates the name hierarchy
with offsets in the name and other TLV(s) indicate the offset to use in
order to retrieve special components.
As for the field separator, it is something like "/". Aliasing is avoided as
you do not rely on field separators to parse the name; you use the "offset
TLV" to do that.

So now, it may be an aesthetic question but:

if you do not need the entire hierarchical structure (suppose you only want
the first x components) you can directly have it using the offsets. With the
Nested TLV structure you have to iteratively parse the first x-1 components.
With the offset structure you can directly access the first x components.

Max




-- Mark

On 9/17/14 6:02 AM, Massimo Gallo wrote:


The why is simple:

You use a lot of "generic" component types and very few "specific"
component types. You are imposing types on every component in order to
handle a few exceptions (segmentation, etc.). You create a rule (specify
the component's type) to handle exceptions!

I would prefer not to have typed components. Instead I would prefer to
have the name as a simple sequence of bytes with a field separator. Then,
outside the name, if you have some components that could be used at the
network layer (e.g. a TLV field), you simply need something that
indicates the offset allowing you to retrieve the version,
segment, etc. in the name...


Max





On 16/09/2014 20:33, Mark Stapp wrote:


On 9/16/14 10:29 AM, Massimo Gallo wrote:


I think we agree on the small number of "component types".
However, if you have a small number of types, you will end up with names
containing many generic component types and few specific component
types. Because the component type specification is an exception in the
name, I would prefer something that specifies a component's type only
when needed (something like UTF8 conventions but that applications MUST
use).



so ... I can't quite follow that. the thread has had some explanation
about why the UTF8 requirement has problems (with aliasing, e.g.) and
there's been email trying to explain that applications don't have to
use types if they don't need to. your email sounds like "I prefer the
UTF8 convention", but it doesn't say why you have that preference in
the face of the points about the problems. can you say why it is that
you express a preference for the "convention" with problems ?

Thanks,
Mark






_______________________________________________
Ndn-interest mailing list
Ndn-interest at lists.cs.ucla.edu
http://www.lists.cs.ucla.edu/mailman/listinfo/ndn-interest

