Archive for the 'information science' category

Around the Web: An altmetrics reading list

I'm doing a presentation at this week's Ontario Library Association Super Conference on a case study of my Canadian War on Science work from an altmetrics perspective. In other words, looking at non-traditional ways of evaluating the scholarly and "real world" impact of a piece of research. Of course, in this case, the research output under examination is itself kind of non-traditional, but that just makes it more fun.

The Canadian War on Science post I'm using as the case study is here.

Here's the session description:

802F Altmetrics in Action: Documenting Cuts to Federal Government Science: An Altmetrics Case Study

The gold standard for measuring scholarly impact is journal article citations. In the online environment we can expand both the conception of scholarly output and how we measure its impact. Blog posts, downloads, page views, comments on blogs, Twitter or Reddit or StumbleUpon mentions, Facebook likes, television, radio or newspaper interviews, online engagement from political leaders, speaking invitations: all are non-traditional measures of scholarly impact. This session will use a case study to explore the pros & cons of the new Altmetrics movement, taking a blog post documenting recent cuts in federal government science and analysing the various kinds of impact it has had beyond academia.

  1. Understand what Altmetrics are
  2. Understand what some pros and cons are of using Altmetrics to measure research impact
  3. Understand ways that academic librarians can use altmetrics to engage their campus communities.

Not surprisingly, I've been reading up on altmetrics and associated issues. Since it's something I already know a fair bit about, my reading hasn't perhaps been as systematic as it might have been...but I still thought it would be broadly helpful to share some of what I've been exploring.

Enjoy!

Some companies & organizations involved:

And please do feel free to add any relevant items that I've missed in the comments.

One response so far

Welcome to Information Culture, the latest blog at Scientific American

I'd like to extend a huge science librarian blogosphere welcome to Information Culture, the newest blog over at Scientific American Blogs!

This past Sunday evening I got a cryptic DM from a certain Bora Zivkovic letting me know that I should watch the SciAm blog site first thing Monday morning. I was busy that morning but as soon as I got out of my meeting I rushed to Twitter and the Internet and lo! and behold!

Information Culture: Thoughts and analysis related to science information, data, publication and culture.

I'm always happy to see librarians invading faculty and researcher blog networks and this is no exception.

What's even happier is that one of the bloggers at the new site is Bonnie Swoger, long-time blogger at Undergraduate Science Librarian. Bonnie is a super blogger and a terrific colleague who I'm always glad to see at Science Online. I'm sort of wondering what's taken so long for a blogging network to snap her up and I guess it's not surprising that Bora's the one to finally get it done.

Joining Bonnie is an equally wonderful but new-to-me blogger, Hadas Shema. Hadas is an Info Sci grad student at Bar-Ilan University in Israel and formerly blogged at Science blogging in theory and practice.

Here's what Bora has to say in his Introductory post: Welcome Information Culture - the newest blog at #SciAmBlogs

How to do an efficient search? How can a librarian help you find obscure references? What is this "Open Access" thing all about? Why is there a gender gap among Wikipedia editors? How do science bloggers link to each other? Can tweeting a link to a paper predict its future citations? How to track down an un-linked paper mentioned in a media article? What is going on with eTextbooks?

And from the new blog itself, a taste of the first three posts:

Introduction post - Hadas Shema

Two questions I get asked now and then are A. "What do you study?" and B. "What is it good for?" (as in "Why should my tax money fund you?"). Now that I have an excellent platform like this SciAm blog, I might as well take advantage of it to answer at least the first question (I'll let you decide if it's worth the taxpayer's money).

I study Information or Library Science, and my sub-field is what used to be called Bibliometrics, "the application of mathematical and statistical methods to books and other media of communication" (Pritchard, 1969). The term was invented back in '69, when official scientific communication involved dead trees. The Russian version, "Scientometrics," was coined around that time as well. Today we have a variety of other terms, perhaps more appropriate for the net age: Cybermetrics, Informetrics, Webometrics and even Altmetrics. But for now, let's stick with Bibliometrics.

Bibliometricians measure, analyze and record scientific discourse. We want to learn what impact scientific articles, journals, and even individual scientists have on the world. Until recently "the world" meant "other articles, journals and individual scientists" because it was next to impossible to research the way scientific discourse affects the rest of the world, or even how scientists affect it when they're not acting in an "official" capacity (publishing a paper or speaking at a conference). Now Bibliometricians not only need a new name, but new indices. That's what I (and plenty of other people) work on. We ask what scientists are doing on the Web, how and why they're doing it and the most important thing - can we use it to evaluate the impact of their work?

You have to share (by Bonnie Swoger)

Understanding how scientists share their results is my job. I am a science librarian.

I work with scientists at my college to make sure that they have access to the information they need to do their work. I teach undergraduates - novice scientists - how the scientific literature works: What kinds of information are available? Where can you find what you need? How can you use the different types of information? I work with researchers to help them understand new developments in scholarly communication: What is a DOI and how can it make your research just a bit easier? Are you allowed to post a copy of your recent article on your website and what are the advantages if you do?

And as I work with students and faculty at my institution, this blog will be a place for me to share some of these concepts with you. I'll share tips to help you find information faster, explain basic concepts related to the publication of scientific results and try to figure out how recent scholarly communication news...

*snip*

It's hard to stand on the shoulders of giants if the giants are hiding under the bed.

Understanding the Journal Impact Factor - Part One (by Hadas Shema)

The journals in which scientists publish can make or break their career. A scientist must publish in "leading" journals with a high Journal Impact Factor, or JIF (you can see it presented proudly on high-impact journals' websites). The JIF has become popular partly because it gives an "objective" measure of a journal's quality and partly because it's a neat little number which is relatively easy to understand. It's widely used by academic librarians, authors, readers and promotion committees.

Raw citation counts emerged in the '20s of the previous century and were used mainly by science librarians who wanted to save money and shelf space by discovering which journals made the best investment in each field. This method had modest success, but it didn't gain much momentum until the sixties. That could be because said librarians had to count citations by hand.
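
For anyone who can't wait for Part Two, the standard two-year JIF boils down to a pretty simple ratio. Here's a minimal sketch in Python of that calculation as I understand it (my own illustration with made-up numbers, not anything from Hadas's post):

```python
def journal_impact_factor(citations_to_prev_two_years, citable_items_prev_two_years):
    """2-year JIF: citations received this year to items the journal published
    in the previous two years, divided by the number of citable items
    (articles, reviews) it published in those two years."""
    return citations_to_prev_two_years / citable_items_prev_two_years

# Hypothetical journal: 500 citations in 2012 to its 2010-2011 papers,
# of which there were 200 "citable items" -> a JIF of 2.5
print(journal_impact_factor(500, 200))
```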

Run on over and say Hi to Bonnie and Hadas!

3 responses so far

Books I'd like to read

For your reading and collection development pleasure!

Planned Obsolescence: Publishing, Technology, and the Future of the Academy by Kathleen Fitzpatrick

Academic institutions are facing a crisis in scholarly publishing at multiple levels: presses are stressed as never before, library budgets are squeezed, faculty are having difficulty publishing their work, and promotion and tenure committees are facing a range of new ways of working without a clear sense of how to understand and evaluate them. Planned Obsolescence is both a provocation to think more broadly about the academy's future and an argument for re-conceiving that future in more communally-oriented ways. Facing these issues head-on, Kathleen Fitzpatrick focuses on the technological changes (especially greater utilization of internet publication technologies, including digital archives, social networking tools, and multimedia) necessary to allow academic publishing to thrive into the future. But she goes further, insisting that the key issues that must be addressed are social and institutional in origin. Confronting a change-averse academy, she insists that before we can successfully change the systems through which we disseminate research, scholars must re-evaluate their ways of working (how they research, write, and review) while administrators must reconsider the purposes of publishing and the role it plays within the university. Springing from original research as well as Fitzpatrick's own hands-on experiments in new modes of scholarly communication through MediaCommons, the digital scholarly network she co-founded, Planned Obsolescence explores all of these aspects of scholarly work, as well as issues surrounding the preservation of digital scholarship and the place of publishing within the structure of the contemporary university.

Program or Be Programmed: Ten Commands for a Digital Age by Douglas Rushkoff

The debate over whether the Net is good or bad for us fills the airwaves and the blogosphere. But for all the heat of claim and counter-claim, the argument is essentially beside the point: It's here; it's everywhere. The real question is, do we direct technology, or do we let ourselves be directed by it and those who have mastered it? "Choose the former," writes Rushkoff, "and you gain access to the control panel of civilization. Choose the latter, and it could be the last real choice you get to make."

In ten chapters, composed of ten "commands" accompanied by original illustrations from comic artist Leland Purvis, Rushkoff provides cyber enthusiasts and technophobes alike with the guidelines to navigate this new universe.

In this spirited, accessible poetics of new media, Rushkoff picks up where Marshall McLuhan left off, helping readers come to recognize programming as the new literacy of the digital age--and as a template through which to see beyond social conventions and power structures that have vexed us for centuries. This is a friendly little book with a big and actionable message.

The Digital Scholar: How Technology is Changing Academic Practice by Martin Weller

While industries such as music, newspapers, film and publishing have seen radical changes in their business models and practices as a direct result of new technologies, higher education has so far resisted the wholesale changes we have seen elsewhere. However, a gradual and fundamental shift in the practice of academics is taking place. Every aspect of scholarly practice is seeing changes effected by the adoption and possibilities of new technologies. This book will explore these changes, their implications for higher education, the possibilities for new forms of scholarly practice and what lessons can be drawn from other sectors.

Unlocking the Gates: How and Why Leading Universities Are Opening Up Access to Their Courses by Taylor Walsh

Over the past decade, a small revolution has taken place at some of the world's leading universities, as they have started to provide free access to undergraduate course materials--including syllabi, assignments, and lectures--to anyone with an Internet connection. Yale offers high-quality audio and video recordings of a careful selection of popular lectures, MIT supplies digital materials for nearly all of its courses, Carnegie Mellon boasts a purpose-built interactive learning environment, and some of the most selective universities in India have created a vast body of online content in order to reach more of the country's exploding student population. Although they don't offer online credit or degrees, efforts like these are beginning to open up elite institutions--and may foreshadow significant changes in the way all universities approach teaching and learning. Unlocking the Gates is one of the first books to examine this important development.

The Innovative University: Changing the DNA of Higher Education from the Inside Out by Clayton Christensen

The language of crisis is nothing new in higher education--for years critics have raised alarms about rising tuition, compromised access, out of control costs, and a host of other issues. Yet, though those issues are still part of the current crisis, it is not the same as past ones. For the first time, disruptive technologies are at work in higher education. For most of their histories, traditional universities and colleges have had no serious competition except from institutions with similar operating models. Now, though, there are disruptive competitors offering online degrees. Many of these institutions operate as for-profit entities, emphasizing marketable degrees for working adults. Traditional colleges and universities have valuable qualities and capacities that can offset those disruptors' advantages--but not for everyone who aspires to higher education, and not without real innovation. How can institutions of higher education think constructively and creatively about their response to impending disruption?

Introduction to Information Science and Technology edited by Charles H. Davis and Debora Shaw

This guide to information science and technology -- the product of a unique scholarly collaboration -- presents a clear, concise, and approachable account of the fundamental issues, with appropriate historical background and theoretical grounding. Topics covered include information needs, seeking, and use; representation and organization of information; computers and networks; structured information systems; information systems applications; users' perspectives in information systems; social informatics; communication using information technologies; information policy; and the information profession.

I have a bit of a backlog of these, so there'll probably be another post pretty soon, maybe even this week.

No responses yet

Thomson Reuters, Nobel Prize predictions and correlation vs. causation

It's time for my annual post taking issue with Thomson Reuters (TR) Nobel Prize predictions.

(2002, 2006, 2007a, 2007b, 2008, 2009, 2010)

Because, yes, they're at it again.

Can the winners of the Nobel Prize be correctly predicted? Since 1989, Thomson Reuters has developed a list of likely winners in medicine, chemistry, physics, and economics. Those chosen are named Thomson Reuters Citation Laureates -- researchers likely to be in contention for Nobel honors based on the citation impact of their published research.

Reading this you would reasonably assume that TR thinks there is at least a little bit of a causal link between citation counts, or as they call it "citation impact," and winning the Nobel Prize. Sure, they say "likely to be in contention" as a way of softening the link they'd like to draw. But they make such a big deal of the whole thing that it's hard not to imagine that they see drawing a strong link between the two as a great way to promote their citation reporting and analysis products.

However, in their Process Essay, they make explicit that they understand that the link between citation counts and true scientific impact is only a correlation:

Numerous studies in the past three decades have shown a strong correlation between citations in the literature and peer esteem, often reflected in professional awards, such as the Nobel Prize. This should cause no surprise. Citations have been likened to repayments of intellectual debts, so persons who have accumulated such credits from their peers are often those whom these peers nominate for prizes and other honors.

*snip*

It is clear that the choices of the Nobel Committees are more complex than simply identifying highly cited or most-cited scientists.

Last year there was an article in the Globe and Mail covering the TR predictions:

"We choose our citation laureates by assessing citation counts and the number of high-impact papers while identifying discoveries or themes that may be considered worthy of recognition by the Nobel committee," said David Pendlebury of Thomson Reuters.

"A strong correlation exists between citations in literature and peer esteem. Professional awards, like the Nobel Prize, are a reflection of this peer esteem."

And Pendlebury again from a comment in my 2009 post:

Many in our lists rank much higher than the top .1%

The reason others suggested the same names we have, in blogs and news stories, is that they have studied our selections in this and past years.

By the way, Blackburn, Greider and Szostak won this morning, and were picked by us this year. Through citation analysis we focused on Blackburn as long ago as 1993: http://archive.sciencewatch.com/interviews/elizabeth_blackburn1.htm

Note that this was before the receipt of the Gairdner Award (1998) and Lasker (2006).

Again, not causality, just a strong correlation between citations and peer esteem.

And I don't disagree with them that Nobel Prizes are not chosen simply on the basis of citation counts.

From Toronto's Dr. James Till, one of the citation laureates that TR chose last year (from the G&M article):

Dr. Till, reached Tuesday at his Toronto home, said he was told by Thomson Reuters that he and Dr. McCulloch are among the top picks for a Nobel. But Dr. Till, known for his scientific rigour, was reluctant to say much about the prediction.

"I'm skeptical," he said. "This is just speculation based on data that Thomson Reuters gathers, citation data."

"This kind of speculation is not something I'd like to comment on."

TR has certainly changed their tune over the last several years in the way they frame their predictions.

That being said, however, I'm still not a fan of the exercise. Citation counts aren't what's important in science and aren't the best way to measure impact. The Alt-Metrics project and many other initiatives have sprung up over the last few years looking for better ways to measure scientific impact than merely using citations.

Basically my position is that citation is a narrow way to gauge true impact and any project that relies primarily on citation data to predict prizes such as the Nobel is fundamentally flawed. I'm sincerely hoping that TR will soon reconsider their annual project.

So, let's see who they've predicted for this year. Note: I'm indicating with letters the teams of scientists that TR have grouped together in their predictions.

Chemistry

  • Allen J. Bard
  • Jean M. J. Fréchet (a)
  • Martin Karplus
  • Donald A. Tomalia (a)
  • Fritz Vögtle (a)

Physics

  • Alain Aspect (a)
  • John F. Clauser (a)
  • Sajeev John (b)
  • Hideo Ohno
  • Eli Yablonovitch (b)
  • Anton Zeilinger (a)

Physiology or Medicine

  • Robert L. Coffman (a)
  • Brian J. Druker (b)
  • Robert S. Langer (c)
  • Nicholas B. Lydon (b)
  • Jacques F. A. P. Miller
  • Timothy R. Mosmann (a)
  • Charles L. Sawyers (b)
  • Joseph P. Vacanti (c)

Economics

  • Douglas W. Diamond
  • Jerry A. Hausman (a)
  • Anne O. Krueger (b)
  • Gordon Tullock (b)
  • Halbert L. White, Jr. (a)

For the four prizes, that's 24 different people chosen in 13 groupings of one or more.

Let's see how they do this year. I predict about the same as previous years, in other words, some right and most wrong. Some of the people picked this year will be chosen for a Nobel this year. Some will get picked in a later year. Probably one or two people who get the Nobel this year will have been picked by TR in a previous year.

After all, over the years TR have picked so many different people that every year their odds improve that the Nobel committees will select someone TR has chosen in the past.

Once again, I would like to emphasize that I have nothing against the scholars whom TR has "nominated" and wish them well. I certainly don't mean to cast a negative light on their contributions to their fields at all. My beef is not with them, but with TR's misuse of their citation data.

5 responses so far

Going to JCDL2011: ACM/IEEE Joint Conference on Digital Libraries

I'll be at the 2011 ACM/IEEE Joint Conference on Digital Libraries at the University of Ottawa for the next few days. I plan on doing a bit of tweeting while I'm there but probably no live blogging.

I hope to have a summary post up here sometime after the conference with my impressions. I'm taking advantage of the relative proximity of Ottawa; this will be my first time at JCDL and I'm really looking forward to it. It's probably a bit more technical than what I've been getting into recently but stretching the mind is always a good thing.

I hope to see some of you there. If you're there and you see me, definitely come say Hi!

No responses yet

From the Archives: Everything is miscellaneous: The power of the new digital disorder by David Weinberger

I have a whole pile of science-y book reviews on two of my older blogs, here and here. Both of those blogs have now been largely superseded by or merged into this one. So I'm going to be slowly moving the relevant reviews over here. I'll mostly be doing the posts one or two per weekend and I'll occasionally be merging two or more shorter reviews into one post here.

This one, of Everything Is Miscellaneous: The Power of the New Digital Disorder, is from August 14, 2007. (Weinberger left a detailed comment at the original post, for those that are interested.)

=======

David Weinberger's Everything is Miscellaneous is one of 2007's big buzz books. You know, the kind of book all the big pundits read and obsess over. Slightly older examples include books like Wikinomics or Everything Bad Is Good for You. People read them and mostly write glowing, fairly uncritical reviews. Like I said, Weinberger's is the latest incarnation of the buzz book in the libraryish world. So, is the book as praiseworthy as the buzz would indicate or is it overrated? Well, both, actually. This is really and truly a thought-provoking book, one that bursts with ideas on every page, a book I found myself engaging and arguing with constantly. In that sense, it is a completely, wildly successful book: it got me thinking, and thinking deeply, about many of the most important issues in the profession. On the other hand, there were times when it seemed a bit misguided and superficial in its coverage of the library world, almost gloatingly dismissive in a way.

So, I think I'll take a bit of a grumpy, devil's advocate point of view in this review. I am usually not shy about pointing out flaws in the books I review, but this will probably be the first time I'm giving what may seem to be a very negative review.

Before I get going, I should talk a little about what the book is actually about. Weinberger's main idea is that the new digital world has revolutionized the way that we are able to organize our stuff. In the physical world, physical stuff needs to be organized in an orderly, concrete way: each thing in its one, singular place. Now, however, digital stuff can be ordered in a million different ways. Each person can order their digital stuff any way they want, and stuff can be placed in infinite different locations as needed. This paradigm shift is, according to Weinberger, a great thing because it's so much more useful to be able to find what we need if we're not limited by how we organize things in the physical world. In other words, our shelves are infinite and changeable rather than limited and static. Think del.icio.us rather than books on a bookstore shelf.

Weinberger is sort of the anti-Michael Gorman (or perhaps Gorman is the anti-Weinberger?) in that the former sees all change brought about by the "new digital disorder" as almost by definition a good thing, whereas Gorman sees any challenge to older notions of publishing, authority and scholarship as heresy, with the heretics to be quickly burnt at the stake. Now, I'm not that fond of either extreme but I am generally much more sympathetic to Weinberger's position: the idea that we need to adjust to and take advantage of the change that is happening, to resist trying to bend it to our old-fogey conceptions and to go with the flow.

So, what are my complaints? I think I'm more or less going to take the book as it unfolds and make the internal debates I had with Weinberger external and see where that takes us. Hopefully they're not all just the grumblings of a cranky old guy pining for the good old days, and we can all learn something from talking about some of the spots where I felt he could have used better explanations or substituted real comparisons for the setting up and demolishing of straw men.

The first thing that bothers me is when he compares bookstores to the Web/Amazon (starting p. 8). In his telling, bookstores are cripplingly limited because books can only be on one shelf at a time, while Amazon can assign as many subjects as they need, plus they have amazing data mining algorithms that drive their recommendation engines, feeding you stuff you might want to read based on what you've bought in the past and/or are looking at now. First of all, most bookstores these days have tables with selected books (based on subject, award winners, whatever) scattered all over the place, highlighting books that they think deserve (or publishers pay) to be singled out. On the other hand, who hasn't clicked on one of Amazon's subject links only to be overwhelmed by zillions of irrelevant items? It works both ways -- physical and miscellaneous are different; both have advantages and disadvantages. After all, the online booksellers only get about 20% of the total business, so people must find that there's a compelling reason to go to physical bookstores.

Starting on page 16, he begins a comparison of the Dewey decimal system libraries use to physically order their books with the subject approach Amazon and other online systems use. I find this comparison more than a bit misleading, almost to the point where I think Weinberger is setting up a straw man to be knocked down. Now, I'm not even a cataloguer and I know that Dewey is a classification system, a way to order books physically on shelves. It has abundant limitations (which Weinberger is more than happy to point out ad nauseam) but it mostly satisfies basic needs. One weakness is, of course, that it uses a hopelessly out of date subject classification system as a basis for ordering. Comparing it to the ability to tag and search in a system like Amazon or del.icio.us is, however, comparing apples to oranges. Those systems aren't really classification systems but subject analysis systems. The real comparison, to be fair, to compare apples to apples, should have been Amazon to the Library of Congress Subject Headings. While LCSH and the way it is implemented are far from perfect, I think that if you compare the use of subject headings in most OPACs to Amazon, you will definitely find that libraries don't fare as poorly as when you compare Amazon to Dewey and card catalogues. And page 16 isn't the only place he gets the Dewey/card catalogue out for a tussle. He goes after Dewey again starting on page 47; on 55-56 he talks as if the card catalogue is the ultimate in library systems; on 57 he refers to Dewey as a "law of physical geography;" on page 58 he again compares a classification system to subject analysis. And on page 60 he doesn't even seem to understand that card catalogues are able to have subject catalogues. The constant apples/oranges comparison continues for a number of pages, with another outbreak on pages 61-2, as he once again complains that Dewey can only represent an item in one place while digital can represent it in many places. Really, the fact that Weinberger doesn't realize that libraries use subject headings as well as classification, and that an item can have more than one subject heading, well, I find that a bit embarrassing for him, especially given the length he goes on about it. Really, David, we get it. Digital good, physical bad. Tagging good, Dewey bad. Amazon good, libraries & bookstores bad.

It was at this point that I thought to myself that in reality, even Amazon has a classification system like Dewey; in fact, they probably have a lot of them. For example, the hard drives on their servers have file allocation tables which point to the physical location of their data files. At a higher level, their relational databases have primary keys which point to various data records. Even their warehouses have classification systems, as their databases must be able to locate items on physical shelves. Compare using a subject card catalogue to find books on WWII with being dropped in the middle of an Amazon warehouse! He sets up the card catalogue as a straw man and he just keeps knocking it down, and it gets tiresome the way he just keeps on taking easy shots.

Weinberger also misunderstands the way people use cookbooks (p. 44). Sure, if people only used cookbooks as a way of slavishly copying recipes for making dinner, then, yeah, the web would put them out of business. But people use cookbooks for a lot of reasons: to learn techniques, to get insight into a culture and way of life, to get a quick overview of a cuisine or a style of cooking, as a source of basic information for improvising, to read for fun, to get an insight into the personality and style of a chef, to get an insight into another historical period. The richness of a good cookbook isn't limited to just recipes.

I have to admit that at this point I was tempted to abandon the book altogether, to brand it as all hype and no real substance, a hoax of a popular business book perpetrated on an unsuspecting librarian audience. Fortunately, I didn't. There were more annoyances, but the book got a lot stronger as it went along, more insightful and more penetrating in its analysis. However, I think I'll stay grumpy. (hehe.)

One of the more annoying arguments (p. 144) that I often encounter in techy sources is that the nature of learning and the evaluation of learning has changed so radically that we will no longer want to bother evaluating students on what they actually know and can do themselves, but rather will only test them on what they can do in teams or can use the web to find out. In other words, no testing without cell phones and the Internet at the ready. Now, I'm not one to say that we should only test students on memorized facts and regurgitated application of rote formulas, and I think you'd be hard-pressed to find many schools that only do that. From my experience, collaboration and group work, research and consultation are all encouraged at all levels of schooling and make up a significant part of most students' evaluation. Students have plenty of opportunity to prove they can work in teams and can find the information they need in a networked environment. But I still think that it's important for students to actually know something themselves, without consultation, and to be able to perform important tasks themselves, without collaboration. Certainly, the level of knowledge and tasks will vary with the age/grade of the students and the course of study they are pursuing. If someone is to contribute to the social construction of knowledge they, well, need to already have something to contribute. In fact, if everyone always only relied on someone else to know something, then the pool of knowledge would dry up. The book asks some important questions: what is the nature of expertise, what is an expert, how do you become an expert, are these terms defined socially or individually, how is expert knowledge advanced, how is expert knowledge communicated? A scientist who pushes the frontiers of knowledge must actually know where they are to begin with. At some level, an engineer must be able to do engineering, not just facilitate team building exercises.

And little bits of innumeracy bug me too. On page 217 he's trying to make the point that the online arXiv has way more readers than the print Nature. ArXiv has "40,000 new papers every year read by 35,000 people" and "Nature has a circulation of 67,500 and claims 660,000 readers -- about 19 days of arxiv's readers." Comparing these two sets of numbers is a totally false comparison. What you really need to do is compare the total download figures for arXiv to the total download figures for Nature PLUS an estimate for the total paper readership. For arXiv, does he think all 40K papers are read by each of the 35K readers, for a potential 1.4 billion article reads? The true article readership is probably much, much smaller than that. As for the print, the most recent Nature (v447 i7148) has 14 articles and letters; for a guesstimate of a whole year in print, multiply by 52 weeks and 660,000 readers and you get a potential 480 million article reads; probably not everyone reads each article, but most probably at least glance at each one. And that's for the print only. He doesn't even seem to realize that Nature, like virtually every scientific journal, has an online version with a potentially huge readership, which Weinberger in no way takes into account. It's clear to me that, at least based on the numbers he gives, what I can actually say about the comparison between the readerships for Nature and arXiv is limited, but that they may not be too dissimilar. Not the point he wants to make, though. Again, the real numbers he should have dug up, but did not seem to want to use, are the total article downloads for each source.
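
Just to make the back-of-the-envelope arithmetic explicit, here's the comparison worked out in a few lines of Python (these are the potential-readership ceilings implied by the numbers above, not actual reads or downloads):

```python
# Upper bound on arXiv article reads: every one of the 35,000 readers reads
# every one of the 40,000 new papers (wildly unrealistic).
arxiv_potential_reads = 40000 * 35000            # 1,400,000,000

# Upper bound on print Nature article reads: 14 articles/letters per issue,
# 52 issues a year, 660,000 claimed readers per issue.
nature_print_potential_reads = 14 * 52 * 660000  # 480,480,000

print(arxiv_potential_reads, nature_print_potential_reads)
```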

Now, I'm not implying that print is a better format for science communication than online -- I've predicted in my My Job in 10 Years series that print will more or less disappear within the next 10 years -- but please, know what you're talking about when you explore these issues. Know the landscape, compare apples to apples.

I find it frustrating that in a book Weinberger dedicates "To the Librarians" he doesn't take a bit more time to find out what librarians actually do, how libraries work in 2007 rather than 1950. (See p. 132 for some cheap shots.) But in the end, I have to say it was worth reading. If I disagreed violently with something on virtually every page, well, at least it got me thinking; I also found many brilliant insights and much solid analysis. A good book demands a dialogue of its readers, and this one certainly demanded that I sit up and pay attention and think deeply about my own ideas. This is an interesting, engaging, important book that explores some extremely timely information trends and ideas, one that I'm sure I haven't done justice to in my grumpiness, one that at times I find myself willfully misunderstanding and misrepresenting (misunderestimating?). I fault myself for being unable to get past its shortcomings in this review; I also fault myself for being unable to see the forest for the trees, for being overly annoyed at what are probably trivial straw men. Read this book for yourself.

(And apologies for what must be my longest, ramblingest, most disorganized, crankiest, least objective review. I'm sure there's an alternate aspect of the quantum multiverse where I've written a completely different review.)

Weinberger, David. Everything Is Miscellaneous: The Power of the New Digital Disorder. New York: Times Books, 2007. 277pp.

No responses yet

McMastergate in chronological order, or, Do libraries need librarians? (Updated!)

So, here's the story. A week or so ago, McMaster University Librarian Jeff Trzeciak gave an invited presentation at Penn State, tasked by the organizers to be controversial.

To say the least, he succeeded. Perhaps the most controversial idea in the presentation was that he would basically no longer hire librarians for his organization, only subject PhDs and IT specialists.

As you can imagine, the library blogosphere and Friendfeedosphere have had a field day with this one.

You can see the slide in question here and get a bit of a background on the situation of librarians at McMaster here.

What follows is a chronological list of all the relevant posts I've been able to find.

There's a pretty lively debate on The Future of Librarianship going on in this Google Doc. Join in!

As usual, if you know of any relevant posts or other online documents that I've missed, please let me know in the comments or at jdupuis at yorku dot ca.

I'm still ruminating about my own response to this and will probably get something up in the not-too-distant future.

(And yes, this is somewhat related to that Future of Academic Libraries conference mita and I wrote about recently.)

Update 2011.04.15: Added a few new posts from April 14 & 15.
Update 2011.04.16: Added a couple of new posts from April 15 & 16 and a link to the Google Docs debate.
Update 2011.04.19: Added a couple of new posts up to April 19 & reposted with that date. I've also backloaded some posts on the Future of Academic Libraries Symposium which I think is related enough to include here.
Update 2011.04.20: Added another item from April 19.
Update 2011.04.21: Added a couple from April 20.
Update 2011.04.24: Added a couple covering up to April 24. I've also expanded the topic to include some posts on online civility that grew out of this and other controversies.
Update 2011.04.28: Added a couple up to April 28.
Update 2011.05.16: Added a couple up to May 16, as the McMaster conference approaches. Also added a straggler from April 21.
Update 2011.05.17: Added a few more from May 15-17. Also, reposted to today's date for the McMaster symposium.
Update 2011.05.20: Added a few more up to May 20.
Update 2011.06.01: Added a few more up to May 27 and a few earlier stragglers.
Update 2011.07.07: Added a few more up to July 5 and a few earlier stragglers.
Update 2011.12.14: Added a few more up to December 14, including a link to another link dump post on The Academic Librarianship -- A Crisis or an Opportunity? symposium. I may copy those symposium-related posts here at some point.
Update 2012.02.29: Brought up to date with the announcement that Jeff Trzeciak is leaving McMaster.

22 responses so far

Computer Science, Web of Science, Scopus, conferences, citations, oh my!

The standard commercial library citation tools, Web of Science (including their newish Proceedings product) and Scopus, have always been a bit iffy for computer science. That's mostly because computer science scholarship is largely conference-based rather than journal-based and those tools have tended to massively privilege the journal literature over conferences.

Of course, these citation tools are problematic at best for judging scholarly impact in any field; using them for CS is even more so. The flaws are really amplified.

A recent article in the Communications of the ACM goes through the problems in a bit more detail: Invisible Work in Standard Bibliometric Evaluation of Computer Science by Jacques Wainer, Cleo Billa and Siome Goldenstein.

A bit about why they did the research:

Multidisciplinary committees routinely make strategic decisions, rule on subjects ranging from faculty promotion to grant awards, and rank and compare scientists. Though they may use different criteria for evaluations in subjects as disparate as history and medicine, it seems logical for academic institutions to group together mathematics, computer science, and electrical engineering for comparative evaluation by these committees.

*snip*

Computer scientists have an intuitive understanding that these assessment criteria are unfair to CS as a whole. Here, we provide some quantitative evidence of such unfairness.

A bit about what they did:

We define researchers' invisible work as an estimation of all their scientific publications not indexed by WoS or Scopus. Thus, the work is not counted as part of scientists' standard bibliometric evaluations. To compare CS invisible work to that of physics, mathematics, and electrical engineering, we generated a controlled sample of 50 scientists from each of these fields from top U.S. universities and focused on the distribution of invisible work rate for each of them using statistical tests.

We defined invisible work as the difference between number of publications scientists themselves list on their personal Web pages and/or publicly available curriculum vitae (we call their "listed production") and number of publications listed for the same scientists in WoS and Scopus. The invisible work rate is the invisible work divided by number of listed production. Note that our evaluation of invisible work rate is an approximation of the true invisible work rate because the listed production of particular scientists may not include all of their publications.
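
To make their definition concrete, here's a minimal sketch in Python of the calculation as I read it (my own paraphrase with made-up numbers, not code from the paper):

```python
def invisible_work_rate(listed_publications, indexed_publications):
    """Fraction of a researcher's self-listed publications that a given
    citation index (WoS, WoS-P or Scopus) fails to capture."""
    invisible_work = listed_publications - indexed_publications
    return invisible_work / listed_publications

# Hypothetical computer scientist: lists 100 papers on their web page,
# but only 34 of them turn up in the index being tested.
print(invisible_work_rate(100, 34))  # 0.66 -> 66% invisible
```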

A bit about what they found:

When CS is classified as a science (as it was in the U.S. News & World Report survey), the standard bibliometric evaluations are unfair to CS as a whole. On average, 66% of the published work of a computer scientist is not accounted for in the standard WoS indexing service, a much higher rate than for scientists in math and physics. Using the new conference-proceedings service from WoS, the average invisible work rate for CS is 46%, which is higher than for the other areas of scientific research. Using Scopus, the average rate is 33%, which is higher than for both EE and physics.

CS researchers' practice of publishing in conference proceedings is an important aspect of the invisible work rate of CS. On average, 82% of conference publications are not indexed in WoS compared to 47% not indexed in WoS-P and 32% not indexed in Scopus.

And a bit about what they suggest:

Faced with multidisciplinary evaluation criteria, computer scientists should lobby for WoS-P, or better, Scopus. Understanding the limitations of the bibliometric services will help a multidisciplinary committee better evaluate CS researchers.

There's quite a bit more in the original article, about what their sample biases might be, some other potential citation services and other issues.

What do I take away from this? Using citation metrics as a measure of scientific impact is suspect at best. In particular (and the authors make this point), trying to use one measure or kind of measure across different disciplines is even more problematic.

Let's just start from scratch. But more on that in another post.

No responses yet

From the Archives: Glut: Mastering information through the ages by Alex Wright


I have a whole pile of science-y book reviews on two of my older blogs, here and here. Both of those blogs have now been largely superseded by or merged into this one. So I'm going to be slowly moving the relevant reviews over here. I'll mostly be doing the posts one or two per weekend and I'll occasionally be merging two or more shorter reviews into one post here.

This one, of Glut: Mastering Information Through the Ages, is from March 25, 2008.

=======

This book should have been called Everything is not Miscellaneous. In fact, this book could be imagined as Weinberger's Everything is Miscellaneous as written by a slightly old-fashioned librarian.

Book-jacket blurb descriptions aside, Alex Wright's Glut is a fascinating look at the history and methods human culture has used to organize and categorize knowledge and information. More academic in tone than Weinberger's book, it's a bit drier and more serious and, of course, a little less prone to unsubstantiated hype. This is very clearly not a book aimed at the business audience; you will find no strategies within to make your customers buy more virtual widgets.

Let's have a taste (p. 3-4):

My aim in writing this book is to resist the tug of mystical techno-futurism and approach the story of the information age by looking squarely backward. This is a story we are only just beginning to understand...[W]e are just starting to recognize the contours of a broader information ecology that has always surrounded us. Just as human beings had no concept of oral culture until they learned how to write, so the arrival of digital culture has given us a new reference point for understanding the analog age.

Overall, Wright has quite a strong humanities focus in the book, with lots about religion, philosophy and literature as well as the history of writing, printing, libraries and how people deal with information. Books and libraries are the main focus. Not so much about biology, physics, chemistry and astronomy. Even the biology chapter is more a mythological or sociological treatment of taxonomy than an emphasis on the scientific systems. It really gives the impression that scientists don't classify or organize, only humanists. The second half of the book is better, but a general treatment of the organization of information is missing a lot if it doesn't include the periodic table or the various number systems. Astronomical tables, navigational charts, fossils, chemical names and descriptions, genomic data: all are extensive systems of organized scientific data. Wright also doesn't pay too much attention to non-Western systems of organization.

So, let's do a quick drive-by series of impressions to get the main points. We start with a brief introduction to information networks and hierarchies in both the natural and human information space, and then move into some discussion of folk taxonomies and the relationship of categorizations to family structures. We then get into the means of transferring information symbolically in pre-literate cultures and the development of written cultures through alphabets. The role of classical Greek culture is stressed and the Library of Alexandria is name-checked.

Next up, Irish monks save civilization, and we get the role of books and libraries in those efforts during the dark ages. The printing press arrives, increasing the distribution of books and fixing texts in time and space. Next we explore Bacon, Wilkins, memory and the role of philosophers. The Enlightenment and the development of the scientific method follow, as does the popularity of encyclopedic projects. Linnaeus versus Buffon and taxonomy.

As book production grows and libraries expand, librarians must systematize the ordering of the collections and we see Dewey and Cutter make their entrances. Here we really do see that in a library, everything is not miscellaneous. Now the true hero of the book makes his appearance -- Paul Otlet! His amazing accomplishments during World War II are examined and explored, followed by the contributions of Vannevar Bush and his Memex. Eugene Garfield, Ted Nelson, some Andrew Keen-like gnashing of teeth (p. 227-229). Jumping to the modern internet era, where the web is a place to talk, we see almost the re-emergence of old-fashioned oral patterns of communication and increasing tensions between oral and literary cultures.

So, on the whole, what do I think? Wright's book is a pretty good summary of what libraries and librarians have done over the years. It's not so good at looking at what's been done outside the humanities. In fact, in the final chapter I sense a bit of disdain for computer science people and technologists in general.

Also a bit of obliviousness (p. 201):

Web browsers are ultimately unidirectional programs, designed primarily to let users consume information from a remote source. To the extent that users can create information through a web browser, they must do so through the mediation of that remote source. As an authoring tool, the web browser is all but impotent.

It's hard to imagine three sentences that could destroy the credibility of a book on web and information culture more in 2008 than those three.

Today most of us experience personal computers as fixed entities, with hierarchical folders and a familiar set of programs. Our computers are not so far removed from the dumb terminals of the mainframe era. They know very little about us. [Vannevar] Bush's vision suggests the possibility of smarter machines that could anticipate our needs and adapt themselves to our behaviours, like good servants. (p. 202-203)

Of course, this vision has been around for quite a while in the form of the data mining technologies so widely used by Google, Amazon and others to actually find out so much about our wants and needs.

Even so, Wright's efforts do repay close attention, with lots of good analysis and history if perhaps a bit limited in scope and reach. I would certainly suggest that anyone interested in where the information landscape has been and where it's going read this book. You may not agree with it, but it will get you thinking.

(Book supplied by publisher.)

Wright, Alex. Glut: Mastering Information Through the Ages. Washington: Joseph Henry Press, 2007. 286pp. ISBN-13: 978-0801475092

One response so far

Interview with The Tweeting Chancellor, Holden Thorp of the University of North Carolina

Welcome to the latest instalment in my occasional series of interviews with people in the world of higher education and scholarly publishing.

This time around it's a bit different with the circumstances being a little unusual. Last week I did a quick back-of-the-envelope post about the Twitter habits of senior academic administrators and my experiences creating a list of those administrators. The use of social networks in education is an area that really interests me and the habits of those senior administrators were something I'd been wondering about.

Well, my old blogging buddy Stephanie Willen Brown saw the post and tweeted it in her capacity as the head of the UNC Journalism and Mass Communications Library, copying the Twitter handle of Holden Thorp, the chancellor of UNC. Well, to make a long story short, Chancellor Thorp saw the tweet and he and I ended up connecting for a short interview.

I'd like to thank Chancellor Thorp for agreeing to this interview and also props to Stephanie for making the connection.

Enjoy!

===============

Q1. When did you start tweeting?

In December of 2010. Here's a blog entry that explains a lot of my interest and approach to Twitter.

(JD: Here's Chancellor Thorp's handle: @chanthorp.)

Q2. What was your initial rationale for getting into the whole social media arena, Twitter especially?

We have a few students who are very interested in higher education - @elizakern, @kkiley, and @cryanbarber. They were putting a lot of interesting student perspectives about events in higher ed on Twitter. I was lurking, reading their stuff, because it was less formal than what would end up in the student newspaper, and, I thought, very insightful. I got tired of typing their names in all the time and decided to set up an account for myself. As described in the blog post mentioned above, I didn't want to set up a phony account.

Q3. How do you decide what to tweet? How do you balance promoting your institution and its activities with the kind of authentic, personal touch that this kind of platform really requires?

I try to create a balance. Certainly sending out links of positive news about the university or the students is a winner. Innovation is my area so I send out stuff about that and follow a lot of people who write in the area like Steven Johnson, Atul Gawande, Steve Case, Lesa Mitchell, Rick Florida, Dan Pink, Maureen Farrell. I retweet a lot of their stuff. I retweet stuff from the students, but only if I have time to go through the links that are in the tweet carefully. I send out things about our sports teams, but try to stay positive (see below). If I'm at a non-revenue sporting event and there is no other person tweeting, I will live-tweet the game. On the personal side, I send a little bit of stuff about my kids out and a few family events from time to time.

Q4. The Internet can be a bit of a rough and tumble place at times. Have you had any less than wonderful experiences and what's your theory on how to handle such things?

Sports offers a lot of possibilities for getting in trouble on the internet in general, and on Twitter in particular. I sent out a tweet on the day of the Duke-Carolina game that was over the line in kidding Duke students. I shouldn't have done it and I apologized, although I did get a lot of new followers that day. Unfortunately, controversy gets you a lot of attention online and that is a big danger that everyone should be aware of.

Q5. Finally, would you recommend getting on Twitter to other senior academic administrators?

Yes. I cannot think of a better way to stay in touch with the students and understand their perspective. @kstate_pres and @ are both doing a really good job.

No responses yet
