[Ndn-interest] A synopsis of NDN

César A. Bernardini mesarpe at gmail.com
Mon Apr 2 02:55:09 PDT 2018


Dear Michael,

Thank you very much for such a list. I think it is a great idea to bring
all the papers together, especially if you make it available on GitHub so
we can all collaborate by adding our papers to the references.

Regarding content, this is a little more complicated, but I was thinking
it could be great if we could try to classify as many papers as we can in
a collaborative way.

Personally, I have been working on caching and security, so I could give
you a hand with these topics.

Cheers,

2018-04-02 9:07 GMT+02:00 Klaus Schneider <klaus at cs.arizona.edu>:

> Dear Michael,
>
> Thanks for the elaborate reply.
>
> I can see your point about the complexity of the task that you describe.
> Certainly it is very hard to create a script that returns perfect results,
> even if the task is just to remove all duplicates.
>
> I think the ability to sort papers by relevance (like in Google Scholar),
> reduces the need to be perfect to some degree. Sure, there are many
> duplicates, non-English papers, and papers of very low quality. But they're
> all at the bottom of the list, so they don't hurt too much.
>
> On the other hand, I can also see the value of your list, which is less
> complete, but more correct (some entries may be missing, but for any
> existing entry there is a higher probability that it actually belongs on
> the list).
>
>
> I guess the larger point is that it's hard for any one person to compete
> with Google ;)
>
> Best regards,
> Klaus
>
>
>
>
> On 01/04/18 23:36, Michael Hucka wrote:
>
>> On Sun, 1 Apr 2018 20:06:17 -0700, Klaus Schneider wrote:
>>
>>> I think your time might be better spent writing a script or
>>> search filter to query these sites (GS, IEEE, ...) and then removing
>>> duplicates, non-English papers, etc., rather than trying to gather
>>> all papers by hand.
>>>
>>> This would bring a number of benefits, such as:
>>>
>>> - Being always up to date
>>> - Sorting by date and relevance (citation count)
>>> - Listing related work (cited by)
>>> - Including a link to the pdf files
>>>
>>> It would also automatically include the techreports that Spyros
>>> mentioned (the NDN techreport site is indexed by Google Scholar).
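[Editor's note: a minimal sketch of the kind of script described above, assuming scraped entries are already available as dicts with "title" and "citations" fields (a hypothetical schema, not any real site's API): dedupe by normalized title and sort by citation count.]

```python
import re

def normalize_title(title):
    # Lowercase and collapse punctuation/whitespace so trivial
    # formatting differences don't create spurious "duplicates".
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def dedupe_and_rank(entries):
    # entries: list of dicts with "title" and "citations" keys (assumed schema).
    seen = {}
    for e in entries:
        key = normalize_title(e["title"])
        # For each title, keep the entry with the higher citation count.
        if key not in seen or e["citations"] > seen[key]["citations"]:
            seen[key] = e
    # Sort by relevance (citation count), most cited first.
    return sorted(seen.values(), key=lambda e: e["citations"], reverse=True)
```

This only handles the mechanical part; as Michael notes below, the hard part is everything a normalized-title comparison cannot catch.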
>>>
>>
>> Have you actually tried to do something like this?
>>
>> Speaking as someone who's been doing research and writing software since
>> the 1980's, I feel I can say with reasonably high confidence that
>> developing a working scheme like this would require a significant amount of
>> time and effort.  (Either that, or we have different ideas in our minds of
>> what constitutes a good implementation.)
>>
>> The problem is not scraping a lot of references from somewhere.  Here's
>> an example of a problem: detecting when Google Scholar's database
>> incorrectly says something is a journal paper when it is actually a
>> conference paper.  A human or a weak AI has to look at the paper to figure
>> it out.  I did that myself, in a lot of cases, and then made hand
>> corrections.  Now this leads to a new requirement for an automation scheme:
>> detect that an existing-but-modified entry is the same as something
>> returned by Google Scholar, so that the next time you run the workflow, it
>> doesn't automatically add it again thinking it's a different entry.  Yes,
>> of course, the problem can be solved.  But this is just one example. Each
>> little new problem adds time to the implementation and its debugging, as
>> well as complexity to the overall system, effort to produce documentation,
>> and software to maintain over time.
>>
>> Linking to the PDFs introduces another wrinkle.  Although I can't share
>> the PDFs publicly because of copyright reasons, I actually have them for
>> probably 99% of the references.   The bibliography I put online has DOIs
>> that link to a lot of the publications directly, so people can get to the
>> PDFs, but they will need to have access to the publication due to
>> copyrights -- it was the best compromise I could come up with, even though
>> I wish I could do more.  Now, the DOIs link to the publisher's page.  To
>> link to the PDFs directly is another level of complexity altogether (there
>> is a lot of variation in journal page formats).  Only in some limited cases
>> like the NDN tech reports or perhaps the IEEE pages could you easily and
>> regularly link to the PDFs.
>>
>> I use Paperpile, which has built-in recognizers for Google Scholar and
>> many publishers' sites.  It can actually import PDFs automatically, extract
>> metadata from the PDF, and query Google Scholar for the bib entry.  It's
>> freaking *amazing*, and makes this kind of work go very quickly.  I don't
>> know how much effort was required to implement its capabilities, but it is
>> clearly not a weekend script.  And even as good as it is, it's not perfect
>> -- it doesn't always work.  That gives us an idea of what it takes to find
>> PDFs.
>>
>> I don't disagree that it would be nice and useful to have the automation
>> you describe.  Who wouldn't like that?  My point here is that unless you
>> are aware of new technology I'm overlooking (which is entirely possible!),
>> I think doing this would be a more difficult engineering problem than it
>> may seem.  And even if it were implemented, that wouldn't be the end of it:
>> software has to be maintained over time, and adapted when service providers
>> change their API or data format. (Which most definitely happens; in fact,
>> last year, Google Scholar changed its page layout, and this completely
>> broke another bibliography system I used to use called Sente.)
>>
>> My conclusion is that developing this automation would not be the best
>> use of my time, and in any case, I have too much on my plate already to even
>> start.  But, perhaps one of the NDN or CCN teams could undertake the
>> development of something like this as an activity.
>>
>> Finally, I apologize for the length of this message, and I think further
>> discussion of this matter would be off-topic for this mailing list.  If
>> people really are interested in continuing discussions, I could throw
>> together a Google group for it.
>>
>> Best regards,
>> MH
>> --
>> Mike Hucka, Ph.D. -- mhucka at caltech.edu -- http://www.cds.caltech.edu/~mhucka
>> Dept. of Computing + Mathematical Sciences, California Institute of Technology
>>
>> _______________________________________________
> Ndn-interest mailing list
> Ndn-interest at lists.cs.ucla.edu
> http://www.lists.cs.ucla.edu/mailman/listinfo/ndn-interest
>