[Ndn-interest] any comments on naming convention?

Thu Sep 18 14:02:26 PDT 2014

Does this make that much difference?

If you want to parse the first 5 components.  One way to do it is:

Read the index, find entry 5, then read in that many bytes from the start
offset of the beginning of the name.
OR
Start reading name, (find size + move ) 5 times.

How much speed are you getting from one to the other?  You seem to imply
that the first one is faster.  I don¹t think this is the case.

In the first one you¹ll probably have to get the cache line for the index,
then all the required cache lines for the first 5 components.  For the
second, you¹ll have to get all the cache lines for the first 5 components.
  Given an assumption that a cache miss is way more expensive than
evaluating a number and computing an addition, you might find that the
performance of the index is actually slower than the performance of the
direct access.

Granted, there is a case where you don¹t access the name at all, for
example, if you just get the offsets and then send the offsets as
parameters to another processor/GPU/NPU/etc.  In this case you may see a
gain IF there are more cache line misses in reading the name than in
reading the index.   So, if the regular part of the name that you¹re
parsing is bigger than the cache line (64 bytes?) and the name is to be
processed by a different processor, then your might see some performance
gain in using the index, but in all other circumstances I bet this is not
the case.   I may be wrong, haven¹t actually tested it.

This is all to say, I don¹t think we should be designing the protocol with
only one architecture in mind. (The architecture of sending the name to a
different processor than the index).

If you have numbers that show that the index is faster I would like to see
under what conditions and architectural assumptions.

Nacho

(I may have misinterpreted your description so feel free to correct me if
I¹m wrong.)

--
Nacho (Ignacio) Solis
Protocol Architect
Principal Scientist
Palo Alto Research Center (PARC)
+1(650)812-4458
Ignacio.Solis at parc.com

On 9/18/14, 12:54 AM, "Massimo Gallo" <massimo.gallo at alcatel-lucent.com>
wrote:

>Indeed each components' offset must be encoded using a fixed amount of
>bytes:
>
>i.e.,
>Type = Offsets
>Length = 10 Bytes
>Value = Offset1(1byte), Offset2(1byte), ...
>
>You may also imagine to have a "Offset_2byte" type if your name is too
>long.
>
>Max
>
>On 18/09/2014 09:27, Tai-Lin Chu wrote:
>>> if you do not need the entire hierarchal structure (suppose you only
>>>want the first x components) you can directly have it using the
>>>offsets. With the Nested TLV structure you have to iteratively parse
>>>the first x-1 components. With the offset structure you cane directly
>>>access to the firs x components.
>> I don't get it. What you described only works if the "offset" is
>> encoded in fixed bytes. With varNum, you will still need to parse x-1
>> offsets to get to the x offset.
>>
>>
>>
>> On Wed, Sep 17, 2014 at 11:57 PM, Massimo Gallo
>> <massimo.gallo at alcatel-lucent.com> wrote:
>>> On 17/09/2014 14:56, Mark Stapp wrote:
>>>> ah, thanks - that's helpful. I thought you were saying "I like the
>>>> existing NDN UTF8 'convention'." I'm still not sure I understand what
>>>>you
>>>> _do_ prefer, though. it sounds like you're describing an entirely
>>>>different
>>>> scheme where the info that describes the name-components is ...
>>>>someplace
>>>> other than _in_ the name-components. is that correct? when you say
>>>>"field
>>>> separator", what do you mean (since that's not a "TL" from a TLV)?
>>> Correct.
>>> In particular, with our name encoding, a TLV indicates the name
>>>hierarchy
>>> with offsets in the name and other TLV(s) indicates the offset to use
>>>in
>>> order to retrieve special components.
>>> As for the field separator, it is something like "/". Aliasing is
>>>avoided as
>>> you do not rely on field separators to parse the name; you use the
>>>"offset
>>> TLV " to do that.
>>>
>>> So now, it may be an aesthetic question but:
>>>
>>> if you do not need the entire hierarchal structure (suppose you only
>>>want
>>> the first x components) you can directly have it using the offsets.
>>>With the
>>> Nested TLV structure you have to iteratively parse the first x-1
>>>components.
>>> With the offset structure you cane directly access to the firs x
>>>components.
>>>
>>> Max
>>>
>>>
>>>> -- Mark
>>>>
>>>> On 9/17/14 6:02 AM, Massimo Gallo wrote:
>>>>> The why is simple:
>>>>>
>>>>> You use a lot of "generic component type" and very few "specific
>>>>> component type". You are imposing types for every component in order
>>>>>to
>>>>> handle few exceptions (segmentation, etc..). You create a rule
>>>>>(specify
>>>>> the component's type ) to handle exceptions!
>>>>>
>>>>> I would prefer not to have typed components. Instead I would prefer
>>>>>to
>>>>> have the name as simple sequence bytes with a field separator. Then,
>>>>> outside the name, if you have some components that could be used at
>>>>> network layer (e.g. a TLV field), you simply need something that
>>>>> indicates which is the offset allowing you to retrieve the version,
>>>>> segment, etc in the name...
>>>>>
>>>>>
>>>>> Max
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 16/09/2014 20:33, Mark Stapp wrote:
>>>>>>
>>>>>>
>>>>>> On 9/16/14 10:29 AM, Massimo Gallo wrote:
>>>>>>>
>>>>>>> I think we agree on the small number of "component types".
>>>>>>> However, if you have a small number of types, you will end up with
>>>>>>> names
>>>>>>> containing many generic components types and few specific
>>>>>>>components
>>>>>>> types. Due to the fact that the component type specification is an
>>>>>>> exception in the name, I would prefer something that specify
>>>>>>> component's
>>>>>>> type only when needed (something like UTF8 conventions but that
>>>>>>> applications MUST use).
>>>>>>>
>>>>>> so ... I can't quite follow that. the thread has had some
>>>>>>explanation
>>>>>> about why the UTF8 requirement has problems (with aliasing, e.g.)
>>>>>>and
>>>>>> there's been email trying to explain that applications don't have to
>>>>>> use types if they don't need to. your email sounds like "I prefer
>>>>>>the
>>>>>> UTF8 convention", but it doesn't say why you have that preference in
>>>>>> the face of the points about the problems. can you say why it is
>>>>>>that
>>>>>> you express a preference for the "convention" with problems ?
>>>>>>
>>>>>> Thanks,
>>>>>> Mark
>>>>>>
>>>>> .
>>>>>
>>>>
>>> _______________________________________________
>>> Ndn-interest mailing list
>>> Ndn-interest at lists.cs.ucla.edu
>>> http://www.lists.cs.ucla.edu/mailman/listinfo/ndn-interest
>>
>
>_______________________________________________
>Ndn-interest mailing list
>Ndn-interest at lists.cs.ucla.edu
>http://www.lists.cs.ucla.edu/mailman/listinfo/ndn-interest