[Ndn-interest] Discovery (was: any comments on naming convention?)

Mon Sep 22 03:00:49 PDT 2014

Marc's email actually went to the same thread. This should start a new one.

Feel free to reply to Marc here.

- Felix

On 22/Sep/14 11:33, Marc.Mosko at parc.com wrote:
> I received some feedback that people would like to know more about discovery and what we have done in ccnx 1.0, so I am starting a new thread for it.
>
> As I mentioned, we are not ready to publish our discovery protocols and I don’t want to go halfway with something we are still working on, but I can talk about what I think discovery is and should do.  Discovery is a very important topic and I don’t think the research on it is anywhere near done.
>
> First, I think we need a clear definition of what services discovery offers.  I would say it should do, at least, “discover greatest” and “discover all” with the obvious variations based on sort order and range, for some given prefix.  It should also support discovery scoped by one (or possibly more) publisher keys.
>
> What does “discover greatest” mean?  One could do something similar to ccnx 0.x and ndn, where it’s based on a canonical sort order of name components and one could ask for the greatest name after a given prefix.  Or, one could do something specific to a versioning protocol or creation time, etc.
>
> What does “discover all” mean?  First, I think we should recognize that some data sets might be very, very large, on the order of millions or billions of possible results (content objects).  The discovery protocol should be able to discover them all, efficiently (if not optimally).
>
> I also think there is no one discovery protocol.  Some applications may want strict ACID-style discovery (i.e. only see “completed” or “whole” results, such as where all segments are available), some might want eventually consistent discovery, and some might accept best-effort discovery.  Some discovery protocols may require authentication and some may be open to the world.
>
> I think the discovery process should be separate from the retrieval process.  If one is publishing, say, 64KB objects or even 4GB objects (or even larger objects!), one does not want to have to fetch each object to discover it.
>
> All this leads me to think we need discovery protocols that allow us to talk about content without actually transferring the content.
>
> The discovery protocol should be able to handle multiple objects with the same name, for example when two publishers overwrite each other or even a single publisher publishes two objects with the same name.
>
> Discovery is also closely related to how routing and the forwarding strategy work.  Does an Interest flood all replicas?  Does it only go anycast-style?  How large can an Interest be, and what’s the performance tradeoff for large Interests (i.e. dropping fragments)?  Is there an in-network control protocol to NAK or are all NAKs end-to-end?  How does a discovery process terminate (i.e. when do you know you’re done)?
>
> Marc
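
As an illustration only (this is not the ccnx 1.0 discovery protocol, which the message above says is unpublished), the "discover greatest" and "discover all" services over a canonical sort order of name components could be sketched as below. The sketch assumes the ordering used by ccnx 0.x and ndn, where a shorter component sorts before a longer one and equal-length components compare bytewise. The function names, the in-memory list of names, and the `after`/`limit` paging parameters are hypothetical. Note that it works on names alone, so nothing is fetched in order to be discovered.

```python
from typing import Iterable, List, Optional, Tuple

# A name is a sequence of opaque byte-string components.
Name = Tuple[bytes, ...]


def name_key(name: Name):
    """Sort key for the assumed canonical order: each component compares
    by length first, then bytewise; a proper prefix sorts before any
    longer name that extends it."""
    return tuple((len(c), c) for c in name)


def has_prefix(name: Name, prefix: Name) -> bool:
    return name[:len(prefix)] == prefix


def discover_greatest(names: Iterable[Name], prefix: Name) -> Optional[Name]:
    """Return the greatest name under `prefix`, or None if there is none."""
    matches = [n for n in names if has_prefix(n, prefix)]
    return max(matches, key=name_key, default=None)


def discover_all(names: Iterable[Name], prefix: Name,
                 after: Optional[Name] = None, limit: int = 2) -> List[Name]:
    """Return up to `limit` names under `prefix`, in canonical order,
    strictly greater than `after` (hypothetical paging for large sets)."""
    matches = sorted((n for n in names if has_prefix(n, prefix)), key=name_key)
    if after is not None:
        matches = [n for n in matches if name_key(n) > name_key(after)]
    return matches[:limit]


# Hypothetical catalogue of published names (names only, no content).
catalogue: List[Name] = [
    (b"doc", b"v1"),
    (b"doc", b"v10"),
    (b"doc", b"v2"),
    (b"other", b"v9"),
]

greatest = discover_greatest(catalogue, (b"doc",))
page1 = discover_all(catalogue, (b"doc",))
page2 = discover_all(catalogue, (b"doc",), after=page1[-1])
```

Here "v10" sorts after "v2" because it is the longer component, so a caller walks the pages until an empty or short page comes back. That only answers the termination question for a single in-memory catalogue; it says nothing about multiple replicas, duplicate names from different publishers, or the consistency levels raised above.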