[Ndn-interest] any comments on naming convention?

Burke, Jeff jburke at remap.ucla.edu
Wed Sep 24 23:18:34 PDT 2014


http://irl.cs.ucla.edu/~zhenkai/papers/chronosync.pdf

https://www.ccnx.org/releases/ccnx-0.7.0rc1/doc/technical/SynchronizationProtocol.html

J. 


On 9/24/14, 11:16 PM, "Tai-Lin Chu" <tailinchu at gmail.com> wrote:

>However, I cannot see whether we can achieve "best-effort *all*-value"
>efficiently.
>There are still interesting topics on
>1. how do we express the discovery query?
>2. is selector "discovery-complete"? I.e., can we express any
>discovery query with the current selectors?
>3. if so, can we re-express the current selectors in a more efficient way?
>
>I personally see named data as a set, which can then be categorized
>into "ordered set" and "unordered set".
>Some questions that any discovery expression must solve:
>1. is this a nil set or not? A nil set means that this name is a leaf
>2. set contains member X?
>3. is set ordered or not
>4. (ordered) first, prev, next, last
>5. if we enforce component ordering, answer question 4.
>6. recursively answer all questions above on any set member
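The set questions above can be sketched as a small interface. This is a hypothetical illustration (the class and method names are invented, not an NDN API):

```python
# Hypothetical sketch of named data viewed as a (possibly ordered) set
# of child components, per the six questions above. Not an NDN API.
class NameSet:
    def __init__(self, children=None, ordered=False):
        # children maps a component string to its child NameSet
        self.children = dict(children or {})
        self.ordered = ordered

    def is_nil(self):                  # Q1: is this name a leaf?
        return not self.children

    def contains(self, component):     # Q2: set membership
        return component in self.children

    def is_ordered(self):              # Q3: ordered or unordered set
        return self.ordered

    def first(self):                   # Q4: first (ordered sets only)
        return min(self.children) if self.ordered and self.children else None

    def last(self):                    # Q4: last
        return max(self.children) if self.ordered and self.children else None

    def next(self, component):         # Q4: next after a given member
        if not self.ordered:
            return None
        later = [c for c in sorted(self.children) if c > component]
        return later[0] if later else None

# Example: /mail/inbox/{1,2,3}, with the inbox level ordered (Q6 is
# answered by recursing into self.children).
inbox = NameSet({"1": NameSet(), "2": NameSet(), "3": NameSet()}, ordered=True)
root = NameSet({"mail": NameSet({"inbox": inbox})})
```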
>
>
>
>On Wed, Sep 24, 2014 at 10:45 PM, Burke, Jeff <jburke at remap.ucla.edu>
>wrote:
>>
>>
>> From: <Marc.Mosko at parc.com>
>> Date: Wed, 24 Sep 2014 16:25:53 +0000
>> To: Jeff Burke <jburke at remap.ucla.edu>
>> Cc: <Ignacio.Solis at parc.com>, <tailinchu at gmail.com>,
>> <ndn-interest at lists.cs.ucla.edu>
>> Subject: Re: [Ndn-interest] any comments on naming convention?
>>
>> I think Tai-Lin’s example was just fine to talk about discovery.
>> /blah/blah/value, how do you discover all the “value”s?  Discovery
>> shouldn’t care if it’s email messages or temperature readings or world
>> cup photos.
>>
>>
>> This is true if discovery means "finding everything" - in which case,
>>as you
>> point out, sync-style approaches may be best.  But I am not sure that
>>this
>> definition is complete.  The most pressing example that I can think of
>>is
>> best-effort latest-value, in which the consumer's goal is to get the
>>latest
>> copy the network can deliver at the moment, and may not care about
>>previous
>> values or (if freshness is used well) potential later versions.
>>
>> Another case that seems to work well is video seeking.  Let's say I
>>want to
>> enable random access to a video by timecode. The publisher can provide a
>> time-code based discovery namespace that's queried using an Interest
>>that
>> essentially says "give me the closest keyframe to 00:37:03:12", which
>> returns a data object that, via its name, provides the exact timecode
>> of the keyframe in question and a link to a segment-based namespace for
>> efficient exact-match playout.  In two roundtrips and in a very
>> lightweight way, the consumer has random access capability.  If NDN is
>> the moral equivalent of IP, then I am not sure we should be afraid of
>> roundtrips that provide this kind of functionality, just as they are
>> used in TCP.
>>
>>
>> I described one set of problems using the exclusion approach, and that
>>an
>> NDN paper on device discovery described a similar problem, though they
>>did
>> not go into the details of splitting interests, etc.  That all was
>>simple
>> enough to see from the example.
>>
>> Another question is how does one do the discovery with exact match
>>names,
>> which is also conflating things.  You could do a different discovery
>>with
>> continuation names too, just not the exclude method.
>>
>> As I alluded to, one needs a way to talk with a specific cache about its
>> “table of contents” for a prefix so one can get a consistent set of
>>results
>> without all the round-trips of exclusions.  Actually downloading the
>> “headers” of the messages would be the same bytes, more or less.  In a
>>way,
>> this is a little like name enumeration from a ccnx 0.x repo, but that
>> protocol has its own set of problems and I’m not suggesting to use that
>> directly.
>>
>> One approach is to encode a request in a name component and a
>>participating
>> cache can reply.  It replies in such a way that one could continue
>>talking
>> with that cache to get its TOC.  One would then issue another interest
>>with
>> a request for not-that-cache.
>>
>>
>> I'm curious how the TOC approach works in a multi-publisher scenario?
>>
>>
>> Another approach is to try to ask the authoritative source for the
>>“current”
>> manifest name, i.e. /mail/inbox/current/<nonce>, which could return the
>> manifest or a link to the manifest.  Then fetching the actual manifest
>>from
>> the link could come from caches because you now have a consistent set of
>> names to ask for.  If you cannot talk with an authoritative source, you
>> could try again without the nonce and see if there’s a cached copy of a
>> recent version around.
>>
>> Marc
>>
>>
>> On Sep 24, 2014, at 5:46 PM, Burke, Jeff <jburke at remap.ucla.edu> wrote:
>>
>>
>>
>> On 9/24/14, 8:20 AM, "Ignacio.Solis at parc.com" <Ignacio.Solis at parc.com>
>> wrote:
>>
>> On 9/24/14, 4:27 AM, "Tai-Lin Chu" <tailinchu at gmail.com> wrote:
>>
>> For example, I see a pattern /mail/inbox/148. I, a human being, see a
>> pattern with static (/mail/inbox) and variable (148) components; with
>> proper naming convention, computers can also detect this pattern
>> easily. Now I want to look for all mails in my inbox. I can generate a
>> list of /mail/inbox/<number>. These are my guesses, and with selectors
>> I can further refine my guesses.
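Tai-Lin's guessing step could be sketched as follows; the function is invented for illustration, and the numeric pattern detection is deliberately naive:

```python
# Illustrative sketch of the guessing idea above: given an observed
# name with a static prefix and a numeric final component, enumerate
# candidate names. A real consumer would then refine these guesses
# with selectors; nothing here is an actual NDN API.
import re

def guesses(observed, count):
    prefix, last = observed.rsplit("/", 1)
    if not re.fullmatch(r"\d+", last):
        raise ValueError("no numeric component to generalize")
    # Generate /prefix/<number> candidates
    return [f"{prefix}/{n}" for n in range(1, count + 1)]
```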
>>
>>
>> I think this is a very bad example  (or at least a very bad application
>> design).  You have an app (a mail server / inbox) and you want it to
>>list
>> your emails?  An email list is an application data structure.  I don’t
>> think you should use the network structure to reflect this.
>>
>>
>> I think Tai-Lin is trying to sketch a small example, not propose a
>> full-scale approach to email. (Maybe I am misunderstanding.)
>>
>>
>> Another way to look at it is that if the network architecture is
>>providing
>> the equivalent of distributed storage to the application, perhaps the
>> application data structure could be adapted to match the affordances of
>> the network.  Then it would not be so bad that the two structures were
>> aligned.
>>
>>
>> I’ll give you an example, how do you delete emails from your inbox? If
>>an
>> email was cached in the network it can never be deleted from your inbox?
>>
>>
>> This is conflating two issues - what you are pointing out is that the
>>data
>> structure of a linear list doesn't handle common email management
>> operations well.  Again, I'm not sure if that's what he was getting at
>> here.  But deletion is not the issue - the availability of a data object
>> on the network does not necessarily mean it's valid from the perspective
>> of the application.
>>
>> Or moved to another mailbox? Do you rely on the emails expiring?
>>
>> This problem is true for most (any?) situations where you use network
>>name
>> structure to directly reflect the application data structure.
>>
>>
>> Not sure I understand how you make the leap from the example to the
>> general statement.
>>
>> Jeff
>>
>>
>>
>>
>> Nacho
>>
>>
>>
>> On Tue, Sep 23, 2014 at 2:34 AM,  <Marc.Mosko at parc.com> wrote:
>>
>> Ok, yes I think those would all be good things.
>>
>> One thing to keep in mind, especially with things like time series
>> sensor
>> data, is that people see a pattern and infer a way of doing it.  That’s
>> easy
>> for a human :)  But in Discovery, one should assume that one does not
>> know
>> of patterns in the data beyond what the protocols used to publish the
>> data
>> explicitly require.  That said, I think some of the things you listed
>> are
>> good places to start: sensor data, web content, climate data or genome
>> data.
>>
>> We also need to state what the forwarding strategies are and what the
>> cache
>> behavior is.
>>
>> I outlined some of the points that I think are important in that other
>> posting.  While “discover latest” is useful, “discover all” is also
>> important, and that one gets complicated fast.  So points like
>> separating
>> discovery from retrieval and working with large data sets have been
>> important in shaping our thinking.  That all said, I’d be happy
>> starting
>> from 0 and working through the Discovery service definition from
>> scratch
>> along with data set use cases.
>>
>> Marc
>>
>> On Sep 23, 2014, at 12:36 AM, Burke, Jeff <jburke at remap.ucla.edu>
>> wrote:
>>
>> Hi Marc,
>>
>> Thanks - yes, I saw that as well. I was just trying to get one step
>> more
>> specific, which was to see if we could identify a few specific use
>> cases
>> around which to have the conversation.  (e.g., time series sensor data
>> and
>> web content retrieval for "get latest"; climate data for huge data
>> sets;
>> local data in a vehicular network; etc.)  What have you been looking at
>> that's driving considerations of discovery?
>>
>> Thanks,
>> Jeff
>>
>> From: <Marc.Mosko at parc.com>
>> Date: Mon, 22 Sep 2014 22:29:43 +0000
>> To: Jeff Burke <jburke at remap.ucla.edu>
>> Cc: <tailinchu at gmail.com>, <ndn-interest at lists.cs.ucla.edu>
>> Subject: Re: [Ndn-interest] any comments on naming convention?
>>
>> Jeff,
>>
>> Take a look at my posting (that Felix fixed) in a new thread on
>> Discovery.
>>
>>
>> http://www.lists.cs.ucla.edu/pipermail/ndn-interest/2014-September/000200.html
>>
>> I think it would be very productive to talk about what Discovery should
>> do,
>> and not focus on the how.  It is sometimes easy to get caught up in the
>> how,
>> which I think is a less important topic than the what at this stage.
>>
>> Marc
>>
>> On Sep 22, 2014, at 11:04 PM, Burke, Jeff <jburke at remap.ucla.edu>
>> wrote:
>>
>> Marc,
>>
>> If you can't talk about your protocols, perhaps we can discuss this
>> based
>> on use cases.   What are the use cases you are using to evaluate
>> discovery?
>>
>> Jeff
>>
>>
>>
>> On 9/21/14, 11:23 AM, "Marc.Mosko at parc.com" <Marc.Mosko at parc.com>
>> wrote:
>>
>> No matter what the expressiveness of the predicates, if the forwarder
>> can send interests different ways you don't have a consistent
>> underlying set to talk about, so you would always need non-range
>> exclusions to discover every version.
>>
>> Range exclusions only work, I believe, if you get an authoritative
>> answer.
>> If different content pieces are scattered between different caches I
>> don't see how range exclusions would work to discover every version.
>>
>> I'm sorry to be pointing out problems without offering solutions but
>> we're not ready to publish our discovery protocols.
>>
>> Sent from my telephone
>>
>> On Sep 21, 2014, at 8:50, "Tai-Lin Chu" <tailinchu at gmail.com> wrote:
>>
>> I see. Can you briefly describe how the ccnx discovery protocol solves
>> all the problems that you mentioned (not just exclude)? A doc would be
>> better.
>>
>> My unserious conjecture( :) ) : exclude is equal to [not]. I will soon
>> expect [and] and [or], so boolean algebra is fully supported.  Regular
>> language or context free language might become part of selector too.
>>
>> On Sat, Sep 20, 2014 at 11:25 PM,  <Marc.Mosko at parc.com> wrote:
>> That will get you one reading then you need to exclude it and ask
>> again.
>>
>> Sent from my telephone
>>
>> On Sep 21, 2014, at 8:22, "Tai-Lin Chu" <tailinchu at gmail.com> wrote:
>>
>> Yes, my point was that if you cannot talk about a consistent set
>> with a particular cache, then you need to always use individual
>> excludes not range excludes if you want to discover all the versions
>> of an object.
>>
>>
>> I am very confused. For your example, if I want to get all today's
>> sensor data, I just do (Any..Last second of last day)(First second of
>> tomorrow..Any). That's 18 bytes.
>>
>>
>> [1]http://named-data.net/doc/ndn-tlv/interest.html#exclude
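For what it's worth, the 18-byte figure appears to count two 8-byte timestamp endpoints plus two one-byte ANY markers; a rough check under that assumption (real NDN-TLV headers would add a few more bytes):

```python
# Rough check of the 18-byte claim above: two exclude ranges, each
# contributing one 8-byte timestamp endpoint and one ANY marker,
# counted here at one byte each. This is an assumption about how the
# figure was derived; actual TLV encoding overhead is not included.
timestamp_bytes = 8    # one endpoint per range
any_marker_bytes = 1   # assumed size of an ANY marker
total = 2 * timestamp_bytes + 2 * any_marker_bytes
```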
>>
>> On Sat, Sep 20, 2014 at 10:55 PM,  <Marc.Mosko at parc.com> wrote:
>>
>> On Sep 21, 2014, at 1:47 AM, Tai-Lin Chu <tailinchu at gmail.com> wrote:
>>
>> If you talk sometimes to A and sometimes to B, you very easily
>> could miss content objects you want to discover unless you avoid
>> all range exclusions and only exclude explicit versions.
>>
>>
>> Could you explain why the missing-content-object situation happens?
>> Also, a range exclusion is just a shorter notation for many explicit
>> excludes; converting from explicit excludes to a ranged exclude is
>> always possible.
>>
>>
>> Yes, my point was that if you cannot talk about a consistent set
>> with a particular cache, then you need to always use individual
>> excludes not range excludes if you want to discover all the versions
>> of an object.  For something like a sensor reading that is updated,
>> say, once per second you will have 86,400 of them per day.  If each
>> exclusion is a timestamp (say 8 bytes), that’s 691,200 bytes of
>> exclusions (plus encoding overhead) per day.
>>
>> yes, maybe using a more deterministic version number than a
>> timestamp makes sense here, but its just an example of needing a lot
>> of exclusions.
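The arithmetic above checks out; as a quick sketch (encoding overhead such as per-component TLV headers is ignored, as Marc notes):

```python
# Quick check of Marc's figures: one sensor reading per second, each
# excluded individually by an 8-byte timestamp. Encoding overhead
# (TLV headers per exclusion) would add to the total.
readings_per_day = 24 * 60 * 60          # 86,400 readings
bytes_per_exclusion = 8                  # one timestamp each
exclusion_bytes_per_day = readings_per_day * bytes_per_exclusion
```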
>>
>>
>> You exclude through 100 then issue a new interest.  This goes to
>> cache B
>>
>>
>> I feel this case is invalid because cache A will also get the
>> interest, and cache A will return v101 if it exists. Like you said,
>> if
>> this goes to cache B only, it means that cache A died. How do you
>> know that v101 even exists?
>>
>>
>> I guess this depends on what the forwarding strategy is.  If the
>> forwarder will always send each interest to all replicas, then yes,
>> modulo packet loss, you would discover v101 on cache A.  If the
>> forwarder is just doing “best path” and can round-robin between cache
>> A and cache B, then your application could miss v101.
>>
>>
>>
>> c,d In general I agree that LPM performance is related to the number
>> of components. In my own thread-safe LPM implementation, I used only
>> one RWMutex for the whole tree. I don't know whether adding a lock for
>> every node will be faster or not because of lock overhead.
>>
>> However, we should compare (exact match + discovery protocol) vs
>> (ndn
>> lpm). Comparing performance of exact match to lpm is unfair.
>>
>>
>> Yes, we should compare them.  And we need to publish the ccnx 1.0
>> specs for doing the exact match discovery.  So, as I said, I’m not
>> ready to claim it’s better yet because we have not done that.
>>
>>
>>
>>
>>
>>
>> On Sat, Sep 20, 2014 at 2:38 PM,  <Marc.Mosko at parc.com> wrote:
>> I would point out that using LPM for content-object-to-Interest
>> matching to do discovery has its own set of problems.  Discovery
>> involves more than just “latest version” discovery too.
>>
>> This is probably getting off-topic from the original post about
>> naming conventions.
>>
>> a.  If Interests can be forwarded multiple directions and two
>> different caches are responding, the exclusion set you build up
>> talking with cache A will be invalid for cache B.  If you talk
>> sometimes to A and sometimes to B, you very easily could miss
>> content objects you want to discover unless you avoid all range
>> exclusions and only exclude explicit versions.  That will lead to
>> very large interest packets.  In ccnx 1.0, we believe that an
>> explicit discovery protocol that allows conversations about
>> consistent sets is better.
>>
>> b. Yes, if you just want the “latest version” discovery that
>> should be transitive between caches, but imagine this.  You send
>> Interest #1 to cache A which returns version 100.  You exclude
>> through 100 then issue a new interest.  This goes to cache B who
>> only has version 99, so the interest times out or is NACK’d.  So
>> you think you have it!  But, cache A already has version 101, you
>> just don’t know.   If you cannot have a conversation around
>> consistent sets, it seems like even doing latest version discovery
>> is difficult with selector based discovery.  From what I saw in
>> ccnx 0.x, one ended up getting an Interest all the way to the
>> authoritative source because you can never believe an intermediate
>> cache that there’s not something more recent.
>>
>> I’m sure you’ve walked through cases (a) and (b) in ndn, I’d be
>> interested in seeing your analysis.  Case (a) is that a node can
>> correctly discover every version of a name prefix, and (b) is that
>> a node can correctly discover the latest version.  We have not
>> formally compared (or yet published) our discovery protocols (we
>> have three, 2 for content, 1 for device) compared to selector based
>> discovery, so I cannot yet claim they are better, but they do not
>> have the non-determinism sketched above.
>>
>> c. Using LPM, there is a non-deterministic number of lookups you
>> must do in the PIT to match a content object.  If you have a name
>> tree or a threaded hash table, those don’t all need to be hash
>> lookups, but you need to walk up the name tree for every prefix of
>> the content object name and evaluate the selector predicate.
>> Content Based Networking (CBN) had some methods to create data
>> structures based on predicates, maybe those would be better.  But
>> in any case, you will potentially need to retrieve many PIT entries
>> if there is Interest traffic for many prefixes of a root.  Even on
>> an Intel system, you’ll likely miss cache lines, so you’ll have a
>> lot of NUMA access for each one.  In CCNx 1.0, even a naive
>> implementation only requires at most 3 lookups (one by name, one by
>> name + keyid, one by name + content object hash), and one can do
>> other things to optimize lookup for an extra write.
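The "at most 3 lookups" point can be illustrated with a toy PIT keyed three ways; this is an invented sketch, not the CCNx 1.0 data structure:

```python
# Toy illustration of the "at most 3 exact-match lookups" idea above:
# each pending interest is stored under exactly one of three keys
# (name; name + keyid; name + content object hash), so matching a
# content object needs at most three hash probes instead of walking
# every prefix of the name. Invented structure, not CCNx 1.0 itself.
pit = {}

def pit_insert(name, keyid=None, obj_hash=None):
    if obj_hash is not None:
        pit[("hash", name, obj_hash)] = "interest-by-hash"
    elif keyid is not None:
        pit[("keyid", name, keyid)] = "interest-by-keyid"
    else:
        pit[("name", name)] = "interest-by-name"

def pit_match(name, keyid, obj_hash):
    # At most three probes, most specific first.
    for key in (("hash", name, obj_hash),
                ("keyid", name, keyid),
                ("name", name)):
        if key in pit:
            return pit[key]
    return None

pit_insert("/a/b", keyid="k1")
```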
>>
>> d. In (c) above, if you have a threaded name tree or are just
>> walking parent pointers, I suspect you’ll need locking of the
>> ancestors in a multi-threaded system (“threaded" here meaning LWP)
>> and that will be expensive.  It would be interesting to see what a
>> cache consistent multi-threaded name tree looks like.
>>
>> Marc
>>
>>
>> On Sep 20, 2014, at 8:15 PM, Tai-Lin Chu <tailinchu at gmail.com>
>> wrote:
>>
>> I had thought about these questions, but I want to know your idea
>> besides typed component:
>> 1. LPM allows "data discovery". How will exact match do similar
>> things?
>> 2. will removing selectors improve performance? How do we use other
>> faster techniques to replace selectors?
>> 3. fixed byte length and type. I agree more that type can be fixed
>> byte, but 2 bytes for length might not be enough for future.
>>
>>
>> On Sat, Sep 20, 2014 at 5:36 AM, Dave Oran (oran)
>> <oran at cisco.com> wrote:
>>
>> On Sep 18, 2014, at 9:09 PM, Tai-Lin Chu <tailinchu at gmail.com>
>> wrote:
>>
>> I know how to make #2 flexible enough to do what things I can
>> envision we need to do, and with a few simple conventions on
>> how the registry of types is managed.
>>
>>
>> Could you share it with us?
>>
>> Sure. Here’s a strawman.
>>
>> The type space is 16 bits, so you have 65,536 types.
>>
>> The type space is currently shared with the types used for the
>> entire protocol, that gives us two options:
>> (1) we reserve a range for name component types. Given the
>> likelihood there will be at least as much and probably more need
>> for component types than protocol extensions, we could reserve 1/2
>> of the type space, giving us 32K types for name components.
>> (2) since there is no parsing ambiguity between name components
>> and other fields of the protocol (since they are sub-types of the
>> name type) we could reuse numbers and thereby have an entire 65K
>> name component types.
>>
>> We divide the type space into regions, and manage it with a
>> registry. If we ever get to the point of creating an IETF
>> standard, IANA has 25 years of experience running registries and
>> there are well-understood rule sets for different kinds of
>> registries (open, requires a written spec, requires standards
>> approval).
>>
>> - We allocate one “default" name component type for “generic
>> name”, which would be used on name prefixes and other common
>> cases where there are no special semantics on the name component.
>> - We allocate a range of name component types, say 1024, to
>> globally understood types that are part of the base or extension
>> NDN specifications (e.g. chunk#, version#, etc.)
>> - We reserve some portion of the space for unanticipated uses
>> (say another 1024 types)
>> - We give the rest of the space to application assignment.
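The strawman partition could look like the following sketch; the boundary values are invented, only the range sizes (1 generic, 1024 well-known, 1024 reserved, remainder to applications, out of 65,536) come from the message:

```python
# Sketch of Dave's strawman partition of a 16-bit name-component type
# space. The exact boundary numbers are invented for illustration; the
# message only specifies the range sizes.
GENERIC      = 0                          # one "default" generic-name type
WELL_KNOWN   = range(1, 1 + 1024)         # chunk#, version#, etc.
RESERVED     = range(1025, 1025 + 1024)   # unanticipated uses
APP_ASSIGNED = range(2049, 65536)         # the rest: application assignment

def region(t):
    # Classify a type code into its registry region.
    if t == GENERIC:
        return "generic"
    if t in WELL_KNOWN:
        return "well-known"
    if t in RESERVED:
        return "reserved"
    if t in APP_ASSIGNED:
        return "application"
    raise ValueError("outside 16-bit type space")
```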
>>
>> Make sense?
>>
>>
>> While I’m sympathetic to that view, there are three ways in
>> which Moore’s law or hardware tricks will not save us from
>> performance flaws in the design
>>
>>
>> we could design for performance,
>>
>> That’s not what people are advocating. We are advocating that we
>> *not* design for known bad performance and hope serendipity or
>> Moore’s Law will come to the rescue.
>>
>> but I think there will be a turning
>> point when the slower design starts to become "fast enough”.
>>
>> Perhaps, perhaps not. Relative performance is what matters so
>> things that don’t get faster while others do tend to get dropped
>> or not used because they impose a performance penalty relative to
>> the things that go faster. There is also the “low-end” phenomenon
>> where improvements in technology get applied to lowering cost
>> rather than improving performance. For those environments bad
>> performance just never gets better.
>>
>> Do you
>> think there will be some design of ndn that will *never* have
>> performance improvement?
>>
>> I suspect LPM on data will always be slow (relative to the other
>> functions).
>> I suspect exclusions will always be slow because they will
>> require extra memory references.
>>
>> However I of course don’t claim to clairvoyance so this is just
>> speculation based on 35+ years of seeing performance improve by 4
>> orders of magnitude and still having to worry about counting
>> cycles and memory references…
>>
>> On Thu, Sep 18, 2014 at 5:20 PM, Dave Oran (oran)
>> <oran at cisco.com> wrote:
>>
>> On Sep 18, 2014, at 7:41 PM, Tai-Lin Chu
>> <tailinchu at gmail.com> wrote:
>>
>> We should not look at a certain chip nowadays and want ndn to
>> perform
>> well on it. It should be the other way around: once  ndn app
>> becomes
>> popular, a better chip will be designed for ndn.
>>
>> While I’m sympathetic to that view, there are three ways in
>> which Moore’s law or hardware tricks will not save us from
>> performance flaws in the design:
>> a) clock rates are not getting (much) faster
>> b) memory accesses are getting (relatively) more expensive
>> c) data structures that require locks to manipulate
>> successfully will be relatively more expensive, even with
>> near-zero lock contention.
>>
>> The fact is, IP *did* have some serious performance flaws in
>> its design. We just forgot those because the design elements
>> that depended on those mistakes have fallen into disuse. The
>> poster children for this are:
>> 1. IP options. Nobody can use them because they are too slow
>> on modern forwarding hardware, so they can’t be reliably used
>> anywhere
>> 2. the UDP checksum, which was a bad design when it was
>> specified and is now a giant PITA that still causes major pain
>> in working around.
>>
>> I’m afraid students today are being taught that the designers
>> of IP were flawless, as opposed to very good scientists and
>> engineers that got most of it right.
>>
>> I feel the discussion today and yesterday has been off-topic.
>> Now I
>> see that there are 3 approaches:
>> 1. we should not define a naming convention at all
>> 2. typed component: use tlv type space and add a handful of
>> types
>> 3. marked component: introduce only one more type and add
>> additional
>> marker space
>>
>> I know how to make #2 flexible enough to do what things I can
>> envision we need to do, and with a few simple conventions on
>> how the registry of types is managed.
>>
>> It is just as powerful in practice as either throwing up our
>> hands and letting applications design their own mutually
>> incompatible schemes or trying to make naming conventions with
>> markers in a way that is fast to generate/parse and also
>> resilient against aliasing.
>>
>> Also everybody thinks that the current utf8 marker naming
>> convention
>> needs to be revised.
>>
>>
>>
>> On Thu, Sep 18, 2014 at 3:27 PM, Felix Rabe <felix at rabe.io>
>> wrote:
>> Would that chip be suitable, i.e. can we expect most names
>> to fit in (the
>> magnitude of) 96 bytes? What length are names usually in
>> current NDN
>> experiments?
>>
>> I guess wide deployment could make for even longer names.
>> Related: Many URLs
>> I encounter nowadays easily don't fit within two 80-column
>> text lines, and
>> NDN will have to carry more information than URLs, as far as
>> I see.
>>
>>
>> On 18/Sep/14 23:15, Marc.Mosko at parc.com wrote:
>>
>> In fact, the index in separate TLV will be slower on some
>> architectures,
>> like the ezChip NP4.  The NP4 can hold the first 96 frame
>> bytes in memory,
>> then any subsequent memory is accessed only as two adjacent
>> 32-byte blocks
>> (there can be at most 5 blocks available at any one time).
>> If you need to
>> switch between arrays, it would be very expensive.  If you
>> have to read past
>> the name to get to the 2nd array, then read it, then backup
>> to get to the
>> name, it will be pretty expensive too.
>>
>> Marc
>>
>> On Sep 18, 2014, at 2:02 PM, <Ignacio.Solis at parc.com>
>> <Ignacio.Solis at parc.com> wrote:
>>
>> Does this make that much difference?
>>
>> If you want to parse the first 5 components. One way to do
>> it is:
>>
>> Read the index, find entry 5, then read in that many bytes
>> from the start
>> offset of the beginning of the name.
>> OR
>> Start reading name, (find size + move ) 5 times.
>>
>> How much speed are you getting from one to the other?  You
>> seem to imply
>> that the first one is faster.  I don’t think this is the
>> case.
>>
>> In the first one you’ll probably have to get the cache line
>> for the index,
>> then all the required cache lines for the first 5
>> components. For the
>> second, you’ll have to get all the cache lines for the first
>> 5 components.
>> Given an assumption that a cache miss is way more expensive
>> than
>> evaluating a number and computing an addition, you might
>> find that the
>> performance of the index is actually slower than the
>> performance of the
>> direct access.
>>
>> Granted, there is a case where you don’t access the name at
>> all, for
>> example, if you just get the offsets and then send the
>> offsets as
>> parameters to another processor/GPU/NPU/etc. In this case
>> you may see a
>> gain IF there are more cache line misses in reading the name
>> than in
>> reading the index.   So, if the regular part of the name
>> that you’re
>> parsing is bigger than the cache line (64 bytes?) and the
>> name is to be
>> processed by a different processor, then you might see some
>> performance
>> gain in using the index, but in all other circumstances I
>> bet this is not
>> the case.   I may be wrong, haven’t actually tested it.
>>
>> This is all to say, I don’t think we should be designing the
>> protocol with
>> only one architecture in mind. (The architecture of sending
>> the name to a
>> different processor than the index).
>>
>> If you have numbers that show that the index is faster I
>> would like to see
>> under what conditions and architectural assumptions.
>>
>> Nacho
>>
>> (I may have misinterpreted your description so feel free to
>> correct me if
>> I’m wrong.)
>>
>>
>> --
>> Nacho (Ignacio) Solis
>> Protocol Architect
>> Principal Scientist
>> Palo Alto Research Center (PARC)
>> +1(650)812-4458
>> Ignacio.Solis at parc.com
>>
>>
>>
>>
>>
>> On 9/18/14, 12:54 AM, "Massimo Gallo"
>> <massimo.gallo at alcatel-lucent.com>
>> wrote:
>>
>> Indeed, each component's offset must be encoded using a fixed
>> amount of bytes:
>>
>> i.e.,
>> Type = Offsets
>> Length = 10 Bytes
>> Value = Offset1(1byte), Offset2(1byte), ...
>>
>> You may also imagine to have a "Offset_2byte" type if your
>> name is too
>> long.
>>
>> Max
>>
>> On 18/09/2014 09:27, Tai-Lin Chu wrote:
>>
>> if you do not need the entire hierarchical structure (suppose you
>> only want the first x components) you can directly have it using
>> the offsets. With the nested TLV structure you have to iteratively
>> parse the first x-1 components. With the offset structure you can
>> directly access the first x components.
>>
>> I don't get it. What you described only works if the "offset" is
>> encoded in fixed bytes. With varNum, you will still need to parse
>> x-1 offsets to get to the xth offset.
>>
>>
>>
>> On Wed, Sep 17, 2014 at 11:57 PM, Massimo Gallo
>> <massimo.gallo at alcatel-lucent.com> wrote:
>>
>> On 17/09/2014 14:56, Mark Stapp wrote:
>>
>> ah, thanks - that's helpful. I thought you were saying "I
>> like the
>> existing NDN UTF8 'convention'." I'm still not sure I
>> understand what
>> you
>> _do_ prefer, though. it sounds like you're describing an
>> entirely
>> different
>> scheme where the info that describes the name-components is
>> ...
>> someplace
>> other than _in_ the name-components. is that correct? when
>> you say
>> "field
>> separator", what do you mean (since that's not a "TL" from a
>> TLV)?
>>
>> Correct.
>> In particular, with our name encoding, a TLV indicates the name
>> hierarchy with offsets in the name, and other TLV(s) indicate the
>> offset to use in order to retrieve special components.
>> As for the field separator, it is something like "/". Aliasing is
>> avoided as you do not rely on field separators to parse the name;
>> you use the "offset TLV" to do that.
>>
>> So now, it may be an aesthetic question but:
>>
>> if you do not need the entire hierarchical structure (suppose you
>> only want the first x components) you can directly have it using
>> the offsets. With the nested TLV structure you have to iteratively
>> parse the first x-1 components. With the offset structure you can
>> directly access the first x components.
>>
>> Max
>>
>>
>> -- Mark
>>
>> On 9/17/14 6:02 AM, Massimo Gallo wrote:
>>
>> The why is simple:
>>
>> You use a lot of "generic component type" and very few "specific
>> component type". You are imposing types for every component in
>> order to handle a few exceptions (segmentation, etc.). You create
>> a rule (specify the component's type) to handle exceptions!
>>
>> I would prefer not to have typed components. Instead I would
>> prefer
>> to
>> have the name as simple sequence bytes with a field
>> separator. Then,
>> outside the name, if you have some components that could be
>> used at
>> network layer (e.g. a TLV field), you simply need something
>> that
>> indicates which is the offset allowing you to retrieve the
>> version,
>> segment, etc in the name...
>>
>>
>> Max
>>
>>
>>
>>
>>
>> On 16/09/2014 20:33, Mark Stapp wrote:
>>
>> On 9/16/14 10:29 AM, Massimo Gallo wrote:
>>
>> I think we agree on the small number of "component types".
>> However, if you have a small number of types, you will end up with
>> names containing many generic component types and few specific
>> component types. Because the component type specification is an
>> exception in the name, I would prefer something that specifies the
>> component's type only when needed (something like the UTF8
>> conventions, but that applications MUST use).
>>
>> so ... I can't quite follow that. the thread has had some
>> explanation
>> about why the UTF8 requirement has problems (with aliasing,
>> e.g.)
>> and
>> there's been email trying to explain that applications don't
>> have to
>> use types if they don't need to. your email sounds like "I
>> prefer
>> the
>> UTF8 convention", but it doesn't say why you have that
>> preference in
>> the face of the points about the problems. Can you say why it is
>> that you express a preference for the "convention" with problems?
>>
>> Thanks,
>> Mark
>>
>>
>> _______________________________________________
>> Ndn-interest mailing list
>> Ndn-interest at lists.cs.ucla.edu
>> http://www.lists.cs.ucla.edu/mailman/listinfo/ndn-interest
>>
>>
>>




