Personal KM: How do you store journal articles?

I’m now squirreling away the PDFs of papers that I download for iSchool, and once I got into the low double digits I realized that I really need a system. I haven’t found a good one yet. Anybody else?

For starters I’m trying to rename them to something logical (Norman-1993-ThingsThatMakeUsSmart.pdf, Markman-2001-Thinking.pdf).
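That naming scheme is easy to automate. Here is a minimal Python sketch, assuming you type in the author, year, and title yourself; the function name `citation_filename` is made up for illustration and isn't part of any tool mentioned here.

```python
import re


def citation_filename(author, year, title, ext=".pdf"):
    """Build a name like Norman-1993-ThingsThatMakeUsSmart.pdf."""
    # Split the title into runs of letters and digits, dropping punctuation.
    words = re.findall(r"[A-Za-z0-9]+", title)
    # Capitalize the first letter of each word and join them together.
    camel = "".join(w[:1].upper() + w[1:] for w in words)
    # Strip spaces and punctuation from the surname as well.
    surname = re.sub(r"[^A-Za-z0-9]", "", author)
    return f"{surname}-{year}-{camel}{ext}"


print(citation_filename("Norman", 1993, "Things that make us smart"))
print(citation_filename("Markman", 2001, "Thinking"))
```

This only builds the name; actually renaming a file would be a separate step (for example with `pathlib.Path.rename`).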

But beyond that I need to keep them organized (1) in an inbox of assignments to be read and (2) with the courses for which they were assigned. I could also use a way to group them (3) by papers I’m writing. I’ve been bitten by the recent fad for (4) tags, so that would be nice. Since book chapters are often scanned without a title page, an ideal system would (5) attach them to whatever I’ve got that resembles a standard bibliographic citation. And (6) any system I come up with should scale up and be portable for longer-term use when the semester or indeed iSchool are things of the past.

I could probably do all of the above except for (5) with aliases: put everything in a single folder and drag aliases into separate folders for each course, paper I’m writing, and topical tag. But that sounds pretty klunky, in part because keeping a lot of Finder windows open is problematic on a 12″ laptop.
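For what it’s worth, the one-folder-plus-aliases idea could be scripted rather than dragged around in the Finder. The sketch below is only an approximation: it uses Unix symbolic links, which are not the same thing as Finder aliases, and the function name `file_under` and the folder names are hypothetical.

```python
from pathlib import Path


def file_under(pdf, library, groups):
    """Link one PDF from the master folder into each group folder.

    `groups` holds folder names relative to `library`, such as an
    inbox, a course, a paper in progress, or a topical tag.
    """
    links = []
    for group in groups:
        folder = library / group
        folder.mkdir(parents=True, exist_ok=True)
        link = folder / pdf.name
        # Leave any existing file or link with that name alone.
        if not link.is_symlink() and not link.exists():
            link.symlink_to(pdf.resolve())
        links.append(link)
    return links
```

Calling something like `file_under(Path("all/Markman-2001-Thinking.pdf"), Path("."), ["inbox", "tags/cognition"])` would leave the PDF in the single master folder and put a link to it in each named folder. It still does nothing for (5).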

So what are other people using?

  1. Prentiss, have you used or checked out CiteSeer yet? I’m guessing that it will have a big chunk of the papers you will be reading either in it or cited in it, it offers a handy set of naming conventions for files so you don’t have to make them up yourself, and it has bibliographic citations (in LaTeX and plain text) ready and raring to be cut and pasted into whatever paper you need to write.

  2. I was curious whether Spotlight in OS X 10.4 would support something like that. I checked out the “tech preview” at Apple’s website, and it says that one form of metadata with “general” support (I think that means all files) is “keyword.” So perhaps we’ll be able to add keywords under the “Get Info” view or something.

  3. Indeed, Prentiss, you can manage your PDFs in iTunes. Just do “File”, “Add to Library”, give it the name of a PDF and Bob’s your uncle. You can make playlists of PDFs, assign metadata like author and “album” and “genre”, etc. When you select a PDF it displays in the previewer.

    Your profs, on the other hand, seem to be stuck in a world where the intellectual property they have labored so hard to produce is handed over to publishers. Stick a bug in their ear that other fields – high energy physics, large parts of computer science, medicine, and many others – have embraced open publication of research results. arXiv, CiteSeer, HubMed, etc are examples of these systems in action. (Or perhaps we all wait until Google scans in everything.)

  4. It’s worth mentioning Google Scholar. It doesn’t cast quite as broad a net as CiteSeer, but it’s improving. (And it’s about a million times faster than CiteSeer.)

    Of course, this doesn’t actually help you at all. I have the same problem—I want to retain the PDFs and PSes of my seminar papers, because (a) I need to classify them by course, topic, etc. and CiteSeer doesn’t know anything about my course schedule; (b) I need to save them locally b/c many of the download URLs require my department’s subscription (say, to the ACM) and I need to read them at home (where my IP address doesn’t scream “Paying Member”).

    The idea of managing PDFs in iTunes is at once brilliant and nauseating: yes, that’s a pretty good interface for representing and interacting with a publication library; no, no, no, get out of my music, you filthy research paper!

  5. Ed, I don’t think it’s quite as simple as my profs being stuck in the old paradigm. They’re quite aware of the “open access” movement in academic publishing. Rather, I think they don’t want to limit the sources they use to those which have moved to the new paradigm. Since we’re at a university with a major research library that pays good money for a lot of subscriptions, I guess they figure we might as well get the use out of them.

    Yeah, Dan, Google Scholar rocks. It may be more limited in its CS holdings than CiteSeer but I believe it’s broader. And because it’s Google, a lot of for-profit publishers are happy to have it index their stuff, which you can then either buy or read for free if your university happens to subscribe. (At least that’s how they’ve got it wired up to work at UT.) So Google Scholar bridges the proprietary and open-access paradigms.

    And yet… I’m getting increasingly worried about Google’s own quasi-monopoly. Right now we love Google because it’s so good at what it does and hasn’t piled mountains of cruft onto its user interface. But once the Google Library project gets big and they’re set up to get a nickel every time we buy one of the books they’ve scanned, I don’t see how they’ll be able to resist pushing the books to the top that we’ll be more likely to buy. And then there are the political pressures which will only get stronger as people realize the importance Google has assumed.

  6. I’d say definitely start with DevonThink (though it isn’t free) for making a searchable, categorised database of all your PDFs. Use BibDesk for storing a bibliography database, which has (or will soon have) the added benefit of being able to automatically keep your folder of PDFs organised (or you could just name them authoryear in one big folder). Then for the tagging, try CiteULike to store your collection of abstracts: this can also be exported as BibTeX straight into your bibliography manager.

    Schubert’s PDF plugin for Safari is handy too, as it lets you open files in your choice of application and puts the original URL of a downloaded PDF into the comments field.

  7. This post and comments have been SO helpful. Ages ago I used a program called Papyrus (now defunct, but it had a great user manual and some nifty features like praising your hard work if you were still at it after midnight). I let my biblio lapse and now all my files are a hodgepodge of aging paper and indecipherable .pdfs. First: WHY are we stuck with .pdf when other e-reader formats would be SO MUCH HANDIER??? Second: I typically just use files/folders in “My Documents.” But all these other ways sound superior. (Gotta get a Mac so I can use DevonThink.)

  8. Beth, what e-reader formats would you prefer over PDFs?

    Yes, I’m concerned about the long-term viability of any single format, especially a proprietary one. But for what it’s intended for — to represent paper documents as faithfully as possible, preserving formatting and graphics while also providing access to the text — PDF seems pretty good. Better than .ps or .doc files, that’s for sure!

  9. Sure, if you need to preserve *all* the formatting in the original, .pdf works. (Setting aside the concerns about proprietary formats.)

    But most of the time, there is no benefit to the formatting in the original. For example, it doesn’t matter what page a sentence appears on when I have a search function that enables me to find any passage easily.

    More important, trying to preserve all print formatting interferes with the readability of the electronic document. Other e-reader formats (e.g., Mobipocket) allow more reader control and increased ability to interact with text. For example, I can insert bookmarks, annotations, etc. in a Mobipocket document, but I can’t do that in .pdf. And Mobipocket is far quicker to load and move from page to page. And Mobipocket allows me to adjust font size according to my vision needs, it scrolls through pages (if I so choose) at a rate that I set, and I don’t have to mouse back and forth to see the whole page because the content adjusts to my equipment. Some e-readers have a “read aloud” function too, I think.

    I’d prefer that there was a standard open format for e-documents (such as that promoted by, but until that happens, we’d still be better served if publishers offered e-reader alternatives to .pdf, since the alternatives are much more usable.

