[Ndn-interest] NDN protocol principles: no privacy?

Mon Mar 14 22:22:43 PDT 2016

On Mar 14, 2016, at 9:53 PM, Tai-Lin Chu <tailinchu at gmail.com> wrote:

Immutable = one data packet cannot be changed after publication.

No, it is not “a single publisher will never use the same name for different contents” (I believe this is "uniquely named") nor “there is some cryptographic function possible on a packet such that one can detect if it changes”.

Immutability is advisory here. If I start to violate immutability in a distributed system where everything is assumed to be immutable, I am doomed to get incorrect results (I guess this sounds mandatory). I think immutability is a good choice overall, and it is better to state it as a principle so that nobody will use NDN the wrong way.

Well, as you seemed to notice, if your system relies on an advisory feature for correctness, that’s a tenuous position.  That’s probably not a correct statement for NDN, as an end node should be able to detect (hopefully) whether a Data object is what it expected, via signature or something, and try again with an exclusion.  However, you might not have liveness.

So my point here is that we talk about immutable objects, but in truth we do not have them unless we use a hash-based name.  We can define that a correctly operating publisher will never re-use a name and that it cryptographically binds every Data object, but that is not immutable objects.  That’s a protocol that generates unique objects.  Unique is not immutable. A publisher could have a failure that causes its sequence number to reset, or a memory error that causes the version number to flip a bit, etc.
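To make the unique-versus-immutable distinction concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the `/example/video` prefix, the `sha256=` and `seq=` component spellings, and the `SequencePublisher` class are hypothetical and are not taken from the NDN or CCNx specifications. It simulates a publisher whose sequence number resets after a failure, and contrasts that with a name that carries a hash of the content.

```python
import hashlib


def hash_name(prefix: str, content: bytes) -> str:
    """Build a name whose last component is the SHA-256 digest of the content."""
    return f"{prefix}/sha256={hashlib.sha256(content).hexdigest()}"


def matches_hash_name(name: str, content: bytes) -> bool:
    """Check that content is exactly what the hash-based name commits to."""
    digest = name.rsplit("sha256=", 1)[-1]
    return hashlib.sha256(content).hexdigest() == digest


class SequencePublisher:
    """Hypothetical publisher that names objects with a sequence number."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self.seq = 0

    def publish(self, content: bytes):
        name = f"{self.prefix}/seq={self.seq}"
        self.seq += 1
        return name, content

    def crash_and_restart(self):
        # Simulated failure: the counter is lost and starts over.
        self.seq = 0


pub = SequencePublisher("/example/video")
first_name, first_content = pub.publish(b"version A")
pub.crash_and_restart()
second_name, second_content = pub.publish(b"version B")

# Same name, different bytes: the names were only unique by convention.
name_reused = first_name == second_name and first_content != second_content

# With a hash-based name, different bytes cannot satisfy the same name.
hashed = hash_name("/example/video", b"version A")
```

In this sketch the sequence-numbered name ends up bound to two different contents after the restart, while the hash-based name lets any node detect the substitution without trusting the publisher's bookkeeping.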

One can build a discovery protocol over exact name matching.

Is the table of contents immutable? I think your approach is fine and efficient but it is rather hard to manage in a large distributed system if the table is not immutable.

If the client sends a nonced name the publisher can return a link (in the ccnx sense, not the routing hints) plus uniquely named data of the table of contents appended for efficiency (to avoid a second round trip).

I would think that doing cursors over table of contents rows is much more efficient in a large distributed system than returning potentially large data objects and doing search.  If I have 100M versions of a name and I want to scan the data for some attribute, I think it's much easier to fetch partial tables of contents and pick the exact ones I want to retrieve than to iterate over the set using some name heuristic.
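As a rough illustration of a cursor over table of contents rows, here is a small Python sketch. The names in it are all hypothetical (`TocRow`, `TocPublisher`, the `/movies/toc/page=N` naming, the page size of 2); it is not the CCNx 1.0 discovery protocol, only one way such an exchange could look when every request is an exact-match name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TocRow:
    name: str          # exact name of a published object
    key_id: str        # signer key id (assumed attribute)
    is_manifest: bool  # whether the object is a manifest


class TocPublisher:
    """Hypothetical service that answers exact-match requests for TOC pages."""

    PAGE_SIZE = 2
    TOC_PREFIX = "/movies/toc"

    def __init__(self, rows):
        self.rows = list(rows)

    def get(self, exact_name: str):
        """Return (rows, has_more) for '/movies/toc/page=N', else None."""
        prefix, _, page = exact_name.rpartition("/page=")
        if prefix != self.TOC_PREFIX or not page.isdigit():
            return None
        start = int(page) * self.PAGE_SIZE
        chunk = self.rows[start:start + self.PAGE_SIZE]
        has_more = start + self.PAGE_SIZE < len(self.rows)
        return chunk, has_more


def scan(publisher: TocPublisher, wanted):
    """Walk the TOC page by page and collect the exact names that match."""
    hits, page = [], 0
    while True:
        reply = publisher.get(f"{TocPublisher.TOC_PREFIX}/page={page}")
        if reply is None:
            break
        chunk, has_more = reply
        hits.extend(row.name for row in chunk if wanted(row))
        if not has_more:
            break
        page += 1
    return hits


rows = [
    TocRow("/movies/superman.mov/v1", "key-1", False),
    TocRow("/movies/superman.mov/v2", "key-2", False),
    TocRow("/movies/superman.mov/v3", "key-1", False),
    TocRow("/movies/superman.mov/v4", "key-2", False),
    TocRow("/movies/superman.mov/v5", "key-1", True),
]
publisher = TocPublisher(rows)
hits = scan(publisher, lambda row: row.key_id == "key-2")
```

The consumer only ever fetches small TOC pages and then requests the exact names it picked, rather than pulling each candidate data object to inspect it.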

This method also allows one to do discovery based on other criteria (attributes), such as signer keyids or other TLV fields, rather than just name elements.  I could, for example, do a */superman.mov/* search for all names with superman.mov as a name component (if that type of query was allowed by the protocol).  Or I could ask for only data objects that were manifests.

Because this matching is done via an explicit discovery protocol, not as part of the forwarder, I have a lot of flexibility in the attributes I can query and the access control I can enforce.
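A minimal sketch of that flexibility, in Python, assuming a discovery service that holds a catalog of entries and an allow-list of requesters. The `Entry` fields, the `CATALOG` contents, the `ALLOWED` set, and the `discover` function are all made up for illustration; no real NDN or CCNx API is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    key_id: str
    is_manifest: bool = False


# Hypothetical catalog held by the discovery service, not by a forwarder.
CATALOG = [
    Entry("/studio/a/superman.mov/v1", "key-1"),
    Entry("/studio/a/superman.mov/v1/manifest", "key-1", True),
    Entry("/mirror/b/superman.mov/v7", "key-2"),
    Entry("/studio/a/batman.mov/v1", "key-1"),
]

# Hypothetical access control list: only these requesters may query.
ALLOWED = {"alice"}


def discover(requester, component=None, key_id=None, manifests_only=False):
    """Return the names of catalog entries matching every given attribute."""
    if requester not in ALLOWED:
        raise PermissionError(f"{requester} may not query this catalog")
    matches = []
    for entry in CATALOG:
        parts = entry.name.strip("/").split("/")
        if component is not None and component not in parts:
            continue  # like a */superman.mov/* query on any name component
        if key_id is not None and entry.key_id != key_id:
            continue
        if manifests_only and not entry.is_manifest:
            continue
        matches.append(entry.name)
    return matches


by_component = discover("alice", component="superman.mov")
by_signer = discover("alice", key_id="key-2")
only_manifests = discover("alice", manifests_only=True)
```

Because the service sits above the forwarder, it can refuse a requester outright or filter on fields that a forwarder would never look at.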

It is a design choice.  NDN (from CCNx 0.x) chose to use mandatory in-network discovery based solely on name components, exclusions, and tree traversal.  That’s ok, and one could claim that is sufficient.  CCNx 1.0 chose to use explicit discovery protocols.  My main point in asking about this is to point out that in-network mandatory discovery at the forwarder level is not a necessary condition.

On Mar 14, 2016, at 9:01 PM, <Marc.Mosko at parc.com> wrote:

On Mar 14, 2016, at 8:44 PM, Tai-Lin Chu <tailinchu at gmail.com> wrote:

sure - I don't want to expose names that identify me, or expose my communication activities. given that, the "network" doesn't have the job of finding things for me by partial names - I only want to expose the details of my communication to a service that I have authenticated, and only when those details are encrypted. the "names" visible to the network in that sort of world just get the packets moving - and the only LPM needed is LPM in the FIB to get me to one or more instances of a service.
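For reference, the kind of LPM meant here (longest-prefix match in the FIB only, on the routable prefix) can be sketched in a few lines of Python. The `Fib` class, the `/svc` prefixes, and the face labels are hypothetical, and a real forwarder would use a more efficient structure than a dictionary scan.

```python
def components(name: str):
    """Split '/a/b/c' into the tuple ('a', 'b', 'c')."""
    return tuple(part for part in name.split("/") if part)


class Fib:
    """Toy forwarding table keyed by name-component prefixes."""

    def __init__(self):
        self.routes = {}

    def add(self, prefix: str, face: str):
        self.routes[components(prefix)] = face

    def lookup(self, name: str):
        """Return the face for the longest registered prefix of name."""
        parts = components(name)
        for length in range(len(parts), -1, -1):
            face = self.routes.get(parts[:length])
            if face is not None:
                return face
        return None


fib = Fib()
fib.add("/svc", "face-1")
fib.add("/svc/east", "face-2")

# Only the routable prefix is meaningful to the network; the remaining
# component stands in for an encrypted, opaque payload name.
east = fib.lookup("/svc/east/opaque-encrypted-blob")
west = fib.lookup("/svc/west/opaque-encrypted-blob")
unknown = fib.lookup("/elsewhere/thing")
```

In this picture the forwarder never needs to match partial names against cached data; it only needs enough of the name to reach a service instance.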

Immutability is related to in-network discovery with LPM.  If all packets are immutable, and there is no in-network discovery, NDN must rely on some other protocol that cannot be built on top of NDN for discovery (we should all agree that randomly guessing a version number or a certain name is not going to work well as “discovery”). This devalues NDN as a “universal” protocol.

Could you please define immutable?  Do you mean that a single publisher will never use the same name for different contents?  Is that mandatory or enforceable?  Or do you mean that there is some cryptographic function possible on a packet such that one can detect if it changes?  Are those cryptographic primitives mandatory in each packet?

I disagree that name suffix completion matching of a data object to an interest is a necessary condition for discovery.  One can build a discovery protocol over exact name matching.  I usually build these so that the cache returns a chunked table of contents listing possible matches.  The CCNx 0.x / NDN approach instead has to return a (potentially very large) data object and walk a tree, which is really only efficient if you expect what you want to be the left-most or right-most child and do not require iteration.

On Mar 14, 2016, at 12:10 PM, Mark Stapp <mjs at cisco.com> wrote:

interesting -

On 3/14/16 11:27 AM, Burke, Jeff wrote:

[...]
RFC 6973 takes a nice approach, for example, by offering
definitions of some technical properties and mechanisms, but not trying
to formulate an overall definition of "privacy".

So I can try to understand your point here - do you agree with the
authors that the primary privacy concerns are those of individuals? (Or,
more generally, are corporations people here for this discussion - a
more generic "data owner"?)

hmm - well, I don't think corporations are people, in the citizens united sense, but I think there's lots of commercial communication that needs to have the best possible protection, whether it's B2C or B2B?

The editors there say
that the body of the document, the discussion of the tradeoffs and
alternatives, is the best way they could come up with to approach that
abstraction. in practical terms, as you know well I think there's been
an over-reliance on opportunistic caching in ICN generally, and as a
result observability and correlation are defined to be positive
properties of ICN communication rather than harmful ones.

Would I be correct to parse your concerns into two pieces that may
have different implications:

- Confidentiality of request (e.g., the consumer side)
- Confidentiality of publication (e.g., the publisher side)

I think I have a mental image of "confidential request" - where an observer cannot see much beyond the routeable prefix needed to reach an instance of the service I want to communicate with. I'm not sure what "confidential publication" means, though? I think I want the replies to my requests to be encrypted with ephemeral, forward-secure key material, I don't want the names in the replies to expose any more than the names in the requests, and I want to be able to authenticate the service before I expose anything about my own identity or intentions. is that what you meant by "the publisher side"?

[...]

most of these six "principles" sounded like "mechanisms" to me - the
list felt like the end of a discussion about alternatives and the best
ways to implement an architecture, rather than the start of one. it
sounded like "we're tired of questions about LPM in the PIT, so we're
going to stop calling that a possible mechanism and start calling it an
inevitable, immutable, unquestionable 'principle'".

Well, to take LPM for an example - it's actually not mentioned in the
principle doc that Alex sent. The principle I suspect that you are
referring to is:

[5] In-Network Name Discovery: Interests should be able to use incomplete
names to retrieve data packets.
A consumer may not know the complete network-level name for data, as
some parts of the name cannot be guessed, computed, or inferred
beforehand. Once initial data is received, naming conventions can help
determine complete names of other related data:

* majority of interests will carry complete names

* in-network name discovery expected to be used to bootstrap
communication

Can you explain your objection in these terms?

sure - I don't want to expose names that identify me, or expose my communication activities. given that, the "network" doesn't have the job of finding things for me by partial names - I only want to expose the details of my communication to a service that I have authenticated, and only when those details are encrypted. the "names" visible to the network in that sort of world just get the packets moving - and the only LPM needed is LPM in the FIB to get me to one or more instances of a service.

Thanks,
Mark
_______________________________________________
Ndn-interest mailing list
Ndn-interest at lists.cs.ucla.edu
http://www.lists.cs.ucla.edu/mailman/listinfo/ndn-interest