[Ndn-interest] any comments on naming convention?

Ignacio.Solis at parc.com Ignacio.Solis at parc.com
Fri Sep 19 13:21:41 PDT 2014


Let me see if I parse your example:

For /a/b/c/d/e
You want to compute hash(/a/b/c/d/e) and then compute hash(/a/b/c/d).

Case A (index):

Step 1 - Read index
Step 2 - find 5th component termination
Step 3 - Read memory for /a/b/c/d/e
Step 4 - Compute hash of /a/b/c/d/e

Step 5 - Read index (cached)
Step 6 - find 4th component
Step 7 - Read memory for /a/b/c/d  (cached)
Step 8 - Compute hash of /a/b/c/d
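
A minimal Python sketch of Case A, assuming a hypothetical layout where the
name is flat bytes plus a side table of one-byte component-end offsets
(along the lines of the "Type = Offsets" example quoted below). Any prefix
is then one index lookup, one slice, and one hash:

```python
import hashlib

def make_indexed_name(components):
    """Concatenate the components and record the end offset of each."""
    name = b""
    offsets = []
    for c in components:
        name += c
        offsets.append(len(name))  # one-byte offsets, as in the quoted example
    return name, offsets

def prefix_hash(name, offsets, n):
    """Hash the first n components: read index, slice memory, hash."""
    return hashlib.sha256(name[:offsets[n - 1]]).hexdigest()

name, offsets = make_indexed_name([b"/a", b"/b", b"/c", b"/d", b"/e"])
h5 = prefix_hash(name, offsets, 5)  # hash of /a/b/c/d/e
h4 = prefix_hash(name, offsets, 4)  # hash of /a/b/c/d (reads now cached)
```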


Case B1 (no index)

Step 1a - Read component /a, parse
Step 1b - Read component /b, parse
Step 1c - Read component /c, parse
Step 1d - Read component /d, parse

Step 1e - Read component /e, parse


Step 2 - compute hash of /a/b/c/d/e

Step 3a - Read component /a, parse (cached)
Step 3b - Read component /b, parse (cached)
Step 3c - Read component /c, parse (cached)
Step 3d - Read component /d, parse (cached)

Step 4 - compute hash of /a/b/c/d
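
Case B1 can be sketched the same way, assuming a hypothetical TLV encoding
with a one-byte type and one-byte length per component. Finding the end of
the first n components means walking n (T, L) headers each time, and the
hashed bytes now include the Ts and Ls:

```python
import hashlib

def tlv_encode(components, comp_type=8):
    """Hypothetical encoding: 1-byte type, 1-byte length, then the value."""
    out = b""
    for c in components:
        out += bytes([comp_type, len(c)]) + c
    return out

def prefix_end(wire, n):
    """Walk n (type, length) headers to find where component n ends."""
    pos = 0
    for _ in range(n):
        length = wire[pos + 1]  # L of the current (T, L) header
        pos += 2 + length       # skip header and value
    return pos

wire = tlv_encode([b"a", b"b", b"c", b"d", b"e"])
h5 = hashlib.sha256(wire[:prefix_end(wire, 5)]).hexdigest()  # parse a-e, hash
h4 = hashlib.sha256(wire[:prefix_end(wire, 4)]).hexdigest()  # reparse a-d (cached), hash
```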



Case B2 (no index, compute hash along the way)

Step 1 - Read component /a, parse
Step 2 - compute hash of /a
Step 3 - Read component /b, parse
Step 4 - compute hash of /a/b
Step 5 - Read component /c, parse
Step 6 - compute hash of /a/b/c
Step 7 - Read component /d, parse
Step 8 - compute hash of /a/b/c/d
Step 9 - Read component /e, parse
Step 10 - compute hash of /a/b/c/d/e
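
Case B2 is where incremental hashing comes in. Sketched in Python under the
same kind of hypothetical one-byte-T, one-byte-L encoding: a streaming hash
plus a snapshot of its state (hashlib's .copy()) after each component
yields every prefix hash in a single pass, with no bytes read twice:

```python
import hashlib

def tlv_encode(components, comp_type=8):
    """Hypothetical encoding: 1-byte type, 1-byte length, then the value."""
    out = b""
    for c in components:
        out += bytes([comp_type, len(c)]) + c
    return out

def all_prefix_hashes(wire, n_components):
    """One pass over the wire: parse a component, feed it to the hash,
    then snapshot the hash state to emit that prefix's digest."""
    h = hashlib.sha256()
    digests = []
    pos = 0
    for _ in range(n_components):
        end = pos + 2 + wire[pos + 1]         # skip (T, L) header and value
        h.update(wire[pos:end])               # read component, feed the hash
        digests.append(h.copy().hexdigest())  # snapshot: hash of this prefix
        pos = end
    return digests

wire = tlv_encode([b"a", b"b", b"c", b"d", b"e"])
hashes = all_prefix_hashes(wire, 5)  # hashes[3] covers /a/b/c/d, hashes[4] the full name
```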

Your argument is that Case A might be faster because Case B1 needs to
reparse and Case B2 needs to recompute the hash.

In Case B1, the reparse is actually really fast if the name is in cache.
Hitting the L1 cache costs about 4 cycles (on something like the Core i7
processors), and the parsing and adds are basically single-cycle
instructions.  An L2 cache miss is at least 40 cycles in the best of
conditions (hitting an unshared L3 cache line) and much more if it has to
go to RAM.

In the case of B2, the extra hashing may not cost that much, depending on
the hashing algorithm.  After all, the algorithm has to go through all the
bytes anyway.

For small names that don’t span multiple cache lines this won’t matter;
everything will be small, everything will be fast.   For large names,
where you go through a lot of memory, you want to optimize the access to
that memory.

Anyway, I’m not an expert in any of this, so I may be mistaken, but I fail
to see the big gains of the index.


So my question is, why add the complexity of the index external to the
name?  Are you really saving that much? Do you have any results for this?
Because, if it’s a matter of saving time, why not just pre-compute the
hashes and send them along with the packet so they don’t need to be
computed again?
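
As a sketch of that last suggestion (the field layout here is entirely
invented for illustration): the sender could compute a truncated hash per
prefix once and carry them alongside the name, so a forwarder only reads at
a fixed offset and never hashes:

```python
import hashlib

def build_prefix_hash_field(components):
    """Sender side: truncated (8-byte) SHA-256 of each successive prefix,
    concatenated in component order."""
    field = b""
    name = b""
    for c in components:
        name += c
        field += hashlib.sha256(name).digest()[:8]
    return field

def lookup_prefix_hash(field, n):
    """Forwarder side: fetch the n-component prefix hash by direct offset."""
    return field[(n - 1) * 8 : n * 8]

field = build_prefix_hash_field([b"/a", b"/b", b"/c", b"/d", b"/e"])
h4 = lookup_prefix_hash(field, 4)  # pre-computed hash of /a/b/c/d, no hashing
```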

Nacho


--
Nacho (Ignacio) Solis
Protocol Architect
Principal Scientist
Palo Alto Research Center (PARC)
+1(650)812-4458
Ignacio.Solis at parc.com





On 9/19/14, 4:00 AM, "Massimo Gallo" <massimo.gallo at alcatel-lucent.com>
wrote:

>Hi Nacho,
>
>I agree that the protocol design should not be done with one single
>architecture in mind. Indeed, different architectures may have different
>cache line sizes, memory latency, etc... So I think the performance of
>the two name encodings is really architecture and implementation dependent.
>
>For example:
>Let's say you need the name up to the 5th component first, then the name
>up to the 4th component:
>
>- In the first way you described to get the name up to the 4th
>component, the second time, you already have the needed data in the
>L1/L2 cache.
>
>- In the second way you may also have all you need in memory (name can
>be longer with T and L in the middle) but, you need to iterate again to
>get the name up to the 4th component. Now, in this case you can improve
>your performance by computing  hash values while you parse the name.
>However, you are spending time in computing hash values (very
>computationally expensive) that you may not need.
>
>So based on the implementation you can have better performance with one
>or the other.
>
>Max
>
>
>On 18/09/2014 23:02, Ignacio.Solis at parc.com wrote:
>> Does this make that much difference?
>>
>> If you want to parse the first 5 components.  One way to do it is:
>>
>> Read the index, find entry 5, then read in that many bytes from the
>>start
>> offset of the beginning of the name.
>> OR
>> Start reading name, (find size + move ) 5 times.
>>
>> How much speed are you getting from one to the other?  You seem to imply
>> that the first one is faster.  I don't think this is the case.
>>
>> In the first one you'll probably have to get the cache line for the
>>index,
>> then all the required cache lines for the first 5 components.  For the
>> second, you'll have to get all the cache lines for the first 5
>>components.
>>    Given an assumption that a cache miss is way more expensive than
>> evaluating a number and computing an addition, you might find that the
>> performance of the index is actually slower than the performance of the
>> direct access.
>>
>> Granted, there is a case where you don't access the name at all, for
>> example, if you just get the offsets and then send the offsets as
>> parameters to another processor/GPU/NPU/etc.  In this case you may see a
>> gain IF there are more cache line misses in reading the name than in
>> reading the index.   So, if the regular part of the name that you're
>> parsing is bigger than the cache line (64 bytes?) and the name is to be
>> processed by a different processor, then you might see some performance
>> gain in using the index, but in all other circumstances I bet this is
>>not
>> the case.   I may be wrong, haven't actually tested it.
>>
>> This is all to say, I don't think we should be designing the protocol
>>with
>> only one architecture in mind. (The architecture of sending the name to
>>a
>> different processor than the index).
>>
>> If you have numbers that show that the index is faster I would like to
>>see
>> under what conditions and architectural assumptions.
>>
>> Nacho
>>
>> (I may have misinterpreted your description so feel free to correct me
>>if
>> I'm wrong.)
>>
>>
>> --
>> Nacho (Ignacio) Solis
>> Protocol Architect
>> Principal Scientist
>> Palo Alto Research Center (PARC)
>> +1(650)812-4458
>> Ignacio.Solis at parc.com
>>
>>
>>
>>
>>
>> On 9/18/14, 12:54 AM, "Massimo Gallo" <massimo.gallo at alcatel-lucent.com>
>> wrote:
>>
>>> Indeed, each component's offset must be encoded using a fixed number of
>>> bytes:
>>>
>>> i.e.,
>>> Type = Offsets
>>> Length = 10 Bytes
>>> Value = Offset1(1byte), Offset2(1byte), ...
>>>
>>> You may also imagine to have a "Offset_2byte" type if your name is too
>>> long.
>>>
>>> Max
>>>
>>> On 18/09/2014 09:27, Tai-Lin Chu wrote:
>>>>> if you do not need the entire hierarchical structure (suppose you only
>>>>> want the first x components) you can directly have it using the
>>>>> offsets. With the Nested TLV structure you have to iteratively parse
>>>>> the first x-1 components. With the offset structure you can directly
>>>>> access the first x components.
>>>> I don't get it. What you described only works if the "offset" is
>>>> encoded in fixed bytes. With varNum, you will still need to parse x-1
>>>> offsets to get to the x offset.
>>>>
>>>>
>>>>
>>>> On Wed, Sep 17, 2014 at 11:57 PM, Massimo Gallo
>>>> <massimo.gallo at alcatel-lucent.com> wrote:
>>>>> On 17/09/2014 14:56, Mark Stapp wrote:
>>>>>> ah, thanks - that's helpful. I thought you were saying "I like the
>>>>>> existing NDN UTF8 'convention'." I'm still not sure I understand
>>>>>>what
>>>>>> you
>>>>>> _do_ prefer, though. it sounds like you're describing an entirely
>>>>>> different
>>>>>> scheme where the info that describes the name-components is ...
>>>>>> someplace
>>>>>> other than _in_ the name-components. is that correct? when you say
>>>>>> "field
>>>>>> separator", what do you mean (since that's not a "TL" from a TLV)?
>>>>> Correct.
>>>>> In particular, with our name encoding, a TLV indicates the name
>>>>> hierarchy
>>>>> with offsets in the name and other TLV(s) indicates the offset to use
>>>>> in
>>>>> order to retrieve special components.
>>>>> As for the field separator, it is something like "/". Aliasing is
>>>>> avoided as
>>>>> you do not rely on field separators to parse the name; you use the
>>>>> "offset
>>>>> TLV " to do that.
>>>>>
>>>>> So now, it may be an aesthetic question but:
>>>>>
>>>>> if you do not need the entire hierarchical structure (suppose you only
>>>>> want
>>>>> the first x components) you can directly have it using the offsets.
>>>>> With the
>>>>> Nested TLV structure you have to iteratively parse the first x-1
>>>>> components.
>>>>> With the offset structure you can directly access the first x
>>>>> components.
>>>>>
>>>>> Max
>>>>>
>>>>>
>>>>>> -- Mark
>>>>>>
>>>>>> On 9/17/14 6:02 AM, Massimo Gallo wrote:
>>>>>>> The why is simple:
>>>>>>>
>>>>>>> You use a lot of "generic component type" and very few "specific
>>>>>>> component type". You are imposing types for every component in
>>>>>>>order
>>>>>>> to
>>>>>>> handle few exceptions (segmentation, etc..). You create a rule
>>>>>>> (specify
>>>>>>> the component's type ) to handle exceptions!
>>>>>>>
>>>>>>> I would prefer not to have typed components. Instead I would prefer
>>>>>>> to
>>>>>>> have the name as a simple sequence of bytes with a field separator.
>>>>>>>Then,
>>>>>>> outside the name, if you have some components that could be used at
>>>>>>> network layer (e.g. a TLV field), you simply need something that
>>>>>>> indicates which is the offset allowing you to retrieve the version,
>>>>>>> segment, etc in the name...
>>>>>>>
>>>>>>>
>>>>>>> Max
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 16/09/2014 20:33, Mark Stapp wrote:
>>>>>>>>
>>>>>>>> On 9/16/14 10:29 AM, Massimo Gallo wrote:
>>>>>>>>> I think we agree on the small number of "component types".
>>>>>>>>> However, if you have a small number of types, you will end up
>>>>>>>>>with
>>>>>>>>> names
>>>>>>>>> containing many generic components types and few specific
>>>>>>>>> components
>>>>>>>>> types. Due to the fact that the component type specification is
>>>>>>>>>an
>>>>>>>>> exception in the name, I would prefer something that specify
>>>>>>>>> component's
>>>>>>>>> type only when needed (something like UTF8 conventions but that
>>>>>>>>> applications MUST use).
>>>>>>>>>
>>>>>>>> so ... I can't quite follow that. the thread has had some
>>>>>>>> explanation
>>>>>>>> about why the UTF8 requirement has problems (with aliasing, e.g.)
>>>>>>>> and
>>>>>>>> there's been email trying to explain that applications don't have
>>>>>>>>to
>>>>>>>> use types if they don't need to. your email sounds like "I prefer
>>>>>>>> the
>>>>>>>> UTF8 convention", but it doesn't say why you have that preference
>>>>>>>>in
>>>>>>>> the face of the points about the problems. can you say why it is
>>>>>>>> that
>>>>>>>> you express a preference for the "convention" with problems ?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Mark
>>>>>>>>
>>>>>>>
>>>>> _______________________________________________
>>>>> Ndn-interest mailing list
>>>>> Ndn-interest at lists.cs.ucla.edu
>>>>> http://www.lists.cs.ucla.edu/mailman/listinfo/ndn-interest
>>
>




