Archive for the 'escience' category

Reading Diary: Data Management for Researchers: Organize, maintain and share your data for research success by Kristin Briney

Kristin Briney's Data Management for Researchers: Organize, maintain and share your data for research success is a book that should be on the shelf (physical or virtual) of every librarian, researcher and research administrator. Scientists, engineers, social scientists, humanists: if your work involves generating and keeping track of digital data, this is the book for you.

Like the title says -- data management for researchers. If you have data and you're a researcher, this is the book for you. Organize, maintain and share, the title says. If you're a researcher that needs to manage data, organizing, maintaining and sharing that data is exactly what you want to do.

And Kristin Briney is just the person to help. With a PhD in chemistry, you know she's been on the researcher side of the equation. And with a Master's in Library and Information Studies, you also know that she's studied the managing/organizing/sharing side of the equation and can bring deep insight and solid advice there too.

And that's the focus of the book -- insight and advice. Insight into the problems and issues around dealing with data and advice on how to deal with them.

The chapter topic areas give a good sense of the topics covered, so I don't have to go into detail with explanations of what's covered:

  • The data problem
  • The data lifecycle
  • Planning for data management
  • Documenting your data
  • Organizing your data
  • Improving data analysis
  • Managing secure and private data
  • Short-term storage
  • Preserving and archiving your data
  • Sharing/publishing your data
  • Reusing data

Briney covers a lot of ground and goes into pretty deep detail for most areas. Inevitably, not every section will be equally relevant to every potential reader and not every detail or discussion will be new information to everyone. Given the breadth of topics, the level of detail in each area and the fact that Briney mostly starts each section from square one, this book will work for readers at pretty well every skill level.

Some judicious skimming will be inevitable for most potential readers, as will perhaps some selective Googling for additional background information in certain areas. Briney has you covered. In fact, an interesting way to deal with the detail might be by taking this book in two passes. The first pass to get a sense of the "universe of data things you need to know" and a second more focused on "what I need to know to survive my current situation." Whether that situation is a librarian hoping to build a data service, a PI hoping to get a little better at the things an onrushing funder mandate is going to require or a grad student ready to tackle their first real project, all the information you need is there. You just have to zero in on it.

That being said, the sections on data management plans, preserving & archiving and sharing data are all must-read sections for everyone. Making research data openly available where possible, for reuse and replication purposes, is an important goal for, in particular, all of science.

I recommend this book without hesitation for all academic libraries. Individual researchers, research administrators, funding agency employees and academic librarians would all find much useful information. Simply giving a copy to new graduate students is probably a worthwhile investment at any institution.

Briney, Kristin. Data Management for Researchers: Organize, maintain and share your data for research success. Exeter, UK: Pelagic Publishing, 2015. 250pp. ISBN-13: 978-1784270117

(Review copy provided by publisher.)

No responses yet

Recent Presentations: Getting Your Science Online and Evaluating Information

As I mentioned way back on October 22nd, I was kindly invited to give a talk at the Brock University Physics Department as part of their seminar series. The talk was on Getting Your Science Online, a topic that I'm somewhat familiar with! Since it was coincidentally Open Access Week, I did kind of an A-Z of online science starting with the various open movements: access, data and notebooks. From there I did a quick tour of the whys and wherefores of blogs and Twitter.

There was a good turnout of faculty and grad students with lots of great questions and feedback, some more skeptical than others but definitely stimulating and, I hope, worthwhile.

Here are the slides:

Thanks again to Thad Haroun and the Brock Physics Department for inviting me!

And the other notable presentation was just yesterday, part of my intervention in a section of one of York's science-for-non-science-majors courses, Natural Science 1700 Computers, Information and Society. The prof, Dov Lungu, and I collaborated on a three-part Information Literacy section for the course. Across my three one-hour sessions, the first part covered some of the basics of surviving the information needs of university life and the second part was a fairly typical library session on how to find resources for the class. The third part was a bit more interesting in that Dov gave me free rein to talk about evaluating information online, pretty well any way I wanted.

I wouldn't normally bother to share my course materials here on the blog, but I rather like the presentation I used and I thought it went over fairly well. The various ridiculous examples I used worked well to spark a bit of discussion in quite a large class.

As usual, I appreciate any feedback.

No responses yet

Getting Your Science Online: Presentation at Brock University Physics Department

It seems that Brock University in St. Catharines, Ontario really likes me. Two years ago, the Library kindly invited me to speak during their Open Access Week festivities. And this year the Physics Department has also very kindly invited me to be part of their Seminar Series, also to talk about Getting Your Science Online, this time during OA Week mostly by happy coincidence.

It's tomorrow, Tuesday October 23, 2012 in room H313 at 12:30.

Here's the abstract I've provided:

Physicist and Reinventing Discovery author Michael Nielsen has said that due to the World Wide Web, “[t]he process of scientific discovery – how we do science – will change more over the next 20 years than in the past 300 years.” Given the cornucopian nature of the Web, there’s a tool or strategy that will help most everyone with concrete goals in their research program. In this session we’ll take a look at some of the incredible opportunities the Web has opened up for scientists, from Open Access publishing, Open Notebooks and Open Data on the one hand, to blogs, Twitter and Google+ on the other.

It looks to be great fun! I'd like to thank Thad Haroun and the Brock Physics Department for kindly inviting me. I'll post the slides on the blog later this week.

The rest of this post is what I guess would count as "supplemental materials" for the presentation. The first chunk is a bunch of links on the main topics of the presentation, all the various "Open Whatever" topics. The rest is a long list of readings on blogs, Twitter and general social media for academics that I've assembled over the years and at least somewhat digested into the presentation, mostly based on a post from last year but with some extra and more recent items added.

Open Access

Open Access Mandates & policies

Open Access Repositories


Open Data

Open Notebook Science

Blogging networks


Blog Aggregators

Some physics & math blogs

And here are the blogs/twitter/social media resources I promised.

Feel free to add any suggestions in the comments.

2 responses so far

Around the Web: More on open data, textmining the literature and The Panton Principles

My colleagues and I are taking our Creative Commons/Panton Principles presentation on the road to another library conference this winter. As a result, I'm still compiling more references on the topic so I thought I'd share what I've found recently with all of you.

Of course, suggestions for more resources are always welcome in the comments.

Some more articles at BioMed Central.

(Yes, blogging has been pathetic of late. I hope to have a decent post up this week and maybe a return to more normal form in the fall.)

Update 2013.01.30: Some followup posts with more resources and presentations I've done here, here, here and here.

No responses yet

Open Data & The Panton Principles: Thoughts on a presentation to librarians

As I mentioned last week, on Tuesday, April 17 I was part of a workshop on Creative Commons our Scholarly Communications Committee put on for York library staff. My section was on open data and the Panton Principles. While not directly related to Creative Commons, we thought talking a bit about an application area for licensing in general and a specific case where CC is applied would be interesting for staff. We figured it would be the least engaging part of the workshop so I agreed to go last and use any time that was left.

Rather unexpectedly, the idea of data licensing and in particular CC0 licensing for data ended up being the topic that most energized the crowd! So we bumped up my part and I ended up going second-to-last. My section sparked a lot of very interesting conversations and feedback from a pretty packed house.

So much so, that while riding home on the bus on Friday with a colleague, she mentioned that the issues I'd talked about on Tuesday had come in handy at the conference she'd attended on campus earlier on Friday. She'd been able to speak intelligently and provocatively about the usefulness of open data to public policy!

So, a huge win.

Lessons learned? I think if I were doing the presentation over again tomorrow, I'd emphasize that the practice of making data openly accessible should not be considered as outside the normal scholarly communications system. It isn't just for pirates and thieves. The goal is to make data sharing a standard practice. The means to that end is to ensure data sets are cited in the literature and by extension to have data sharing become an accepted part of the normal academic reward and incentives structure. You create data, you share it, someone else uses it for their research, they cite your data set in their paper, that citation is counted with the same weight as a citation to a paper.

And within that understanding, I think I also would have emphasized more that it's just the right thing to do. Sure, you can fear being scooped with your own data, that someone will replicate your claims and try and take the credit, sure someone might even try and claim that they created your data themselves. But these "risks" should be seen as no different from the risks of publishing anything -- a journal article, a blog post, some code.

But those are far outweighed by the great potential of making scientific data open.

In any case, that's for next time. And hopefully there will be a next time. We're definitely hoping to take our workshop to a conference somewhere.

I don't believe any of the other presentations by my colleagues are online, but I'll link to them here if I find them.

And speaking of presentations, here are my slides:

I'll note that I'm waiving all rights to the slides and releasing them with a CC0 waiver. So have at them!

Also, here are the resources I used for my presentation as well as a few more that have come to light since my original post.

And some new ones:

As before, any suggestions for further resources would be greatly appreciated in the comments.

One response so far

Around the Web: Some resources on the Panton Principles & open data

As part of a workshop on Creative Commons, I'm doing a short presentation on Open Data and The Panton Principles this week to various members of our staff. I thought I'd share some of the resources I've consulted during my preparations. I'm using textmining of journal articles as an example so I'm including a few resources along those lines as well.

Please feel free to suggest additional resources in the comments.

Update 2013.01.30: Some followup posts with more resources and presentations I've done here, here, here and here.

2 responses so far

Interview with Michael Nielsen, author of Reinventing Discovery: The New Era of Networked Science

Welcome to the latest installment in my very occasional series of interviews with people in the scitech world. This time around the subject is Michael Nielsen, author of the recently published Reinventing Discovery: The New Era of Networked Science and prolific speaker on the Open Science lecture circuit. A recent example of his public speaking is his TEDxWaterloo talk on Open Science.

You can follow his blog here and read his recent Wall Street Journal article, The New Einsteins Will Be Scientists Who Share.

I'd like to thank Michael for his provocative and insightful responses. Enjoy!

Q0. Hi Michael, would you mind telling us a little about yourself and how you ended up writing and speaking about open science?

My original training is as a theoretical physicist --- I worked on quantum computing and related topics full time for about 13 years, and part time for a few years prior to that.

But at the same time as I was working on quantum computing, I was also following closely all the amazing things happening online -- things like the development of Google, Wikipedia, open source software, and so on. And as I watched, it came to seem to me that these tools have begun (though far from concluded!) a revolution in the way we construct knowledge.

For a long time I expected that tools like this would also revolutionize how science is done. And we've certainly seen some exciting developments along those lines. But overall scientists have been very conservative in how they've adopted new online tools, in large part because of cultural barriers in science, barriers that mean scientists don't get a whole lot of credit for sharing knowledge in new ways.

I found this conservatism frustrating, and wanted to work to help change the culture of science. So in 2007 I decided to leave my tenured position as an academic to work full time on open science.

Q1. Your new book is Reinventing Discovery: The New Era of Networked Science. Briefly, what is it about and what is the intended audience?

It's about the potential of the network to transform the way scientific discoveries are made. I think the day-to-day process of science will dramatically shift over the next few decades, speeding up the rate at which discoveries are made, and making possible whole new ways of attacking problems. But that will only happen if the culture of science becomes much more open -- to reach its potential networked science must also be open science. And so the book is also a manifesto for open science.

Q2. A significant percentage of the people doing science out there are academics. It's easy to see how open science integrates into the research part of their jobs but how about into their teaching and service requirements?

There are many things academics can do to integrate open science into teaching and service. Here's just a few ideas:

  1. Contribute to projects like Wikipedia and Citizendium, perhaps by giving students projects to improve articles in particular areas.
  2. Academics can potentially combine research, teaching and outreach through projects such as Zooniverse, which is becoming a general purpose platform for connecting scientists to the general public, so the public can make real contributions to scientific research projects. Zooniverse are probably best known for Galaxy Zoo, a very successful project to crowdsource galaxy classifications, but they also run many other citizen science projects.
  3. Academics can upload some of their teaching materials online, where they can be used by others. Aside from the intrinsic worthiness of doing this, it can certainly help improve teaching. YouTube, for example, gives detailed analytics -- you actually get a graph showing how much attention people pay to different parts of your video. From painful personal experience I can say that sometimes that graph plummets, as people leave your video in droves. Usually that's a great diagnostic that you're messing something up in your explanation, and need to improve.

Once you start looking into these and other similar possibilities, you realize that there are a multitude of ways to incorporate open science into the classroom and into service. Many of these ways are free or inexpensive, with the main limit being imagination.

Q3. It seems to me that the key to changing the way science is done is changing the incentive structures for working scientists. What could a new incentive structure look like that would encourage more openness? Are there some practical steps that can move things forward?

This is a question that an entire book could easily be written about. With that said, here's a few things that can be done:

  1. Individual scientists can make a point of citing non-traditional research contributions, like open data sets, code, and videos. Eventually we'll see journals that make it possible to publish data, code and video as first-class research objects in their own right, with the same status as conventional paper publication. Some efforts in this direction include GigaScience, Open Research Computation and the Journal of Visualized Experiments. Citations to those contributions will then show up in conventional measures of academic productivity --- things like citation count --- and so give people an incentive to contribute in new ways.
  2. People can build tools to measure the impact of non-traditional research contributions. The SPIRES service helped drive the adoption of preprint culture in physics, by providing a way of measuring the impact of preprints. There's no reason similar services shouldn't be set up for contributions to blogs, wikis, question and answer sites like MathOverflow, and so on. Indeed, MathOverflow already has a tool like this built in --- a measure of reputation for users. And there's other ideas exploring this space, like altmetrics and total impact. Do these replace conventional measures, like total number of citations? No, of course not. But people are often surprisingly aware of such reputation measures, and they will gradually enter the mainstream, show up on people's CVs, and so on.
  3. People who work at grant agencies or in senior positions in academia can help legitimize new forms of contribution. Simply inviting scientists to submit non-traditional evidence of impact would be a good start.

These are all small but significant steps, and it's through such steps that a change to a more open scientific culture will gradually come about.

Q4. Or perhaps the key is to get them young: how do we need to change the training and mentoring scientists get, to encourage them to be more open?

I don't think there's anything terribly complicated required here. Just getting students involved in open science projects is a big help. People like Steve Koch have mentored students like Andy Maloney, who've done much of their work in the open. Those students then go off and carry those techniques elsewhere, slowly changing the overall culture of science.

Q5. Perhaps the classic example you use in your talks is the Polymath project -- an experiment in massively collaborative mathematics. Do you see a future for this type of project and do you think the model is generalizable beyond mathematics?

Yes, I see a big future for this kind of project, although I think that Polymath and similar projects will morph into other forms. The original Polymath Project was done using off-the-shelf tools -- WordPress and MediaWiki -- that definitely aren't designed for massively collaborative mathematics. And so I think that we can develop much better tools, and also better social norms, that will make it possible to go much, much further.

To some extent this is already happening with the question and answer site MathOverflow, which has attracted a strong and growing community of mathematicians. It's not uncommon to see a challenging technical question posted to MathOverflow and then answered within minutes or hours.

As to whether this model is generalizable beyond mathematics, it certainly is, although with some qualifications. It depends on where the bottlenecks are in doing research. If the major bottlenecks are (say) construction of an experimental device, or taking samples, then obviously the net only helps a little. But if the bottlenecks are data analysis, or something more conceptual -- and I don't just mean theory, the bottlenecks in doing experiments are often conceptual -- then there is the potential for a networked approach to really help. What gets me excited is the fact that we're still in the very early days of this; there's a lot of room for people with imagination to go much, much further.

No responses yet

Issues in Science & Technology Librarianship, Summer 2011: E-Science Librarianship: Field Undefined

Another issue full of interesting articles:

If I may highlight one of the articles this time around, I think the E-Science Librarianship: Field Undefined is an interesting and worthwhile examination of job ads for (broadly defined) e-science librarian positions to try and get a handle on what exactly e-science librarianship is and what people in this newly defined area actually do.

The results confirm a definition of e-science librarian: someone who works collaboratively, and uses technology and library skills within the domain of science. Yet this definition is so vague, it does little to answer the question of "what is an e-science librarian" in terms of the actual roles, tasks, and positions of librarians involved in e-science. In fact, by taking a closer look at the job titles and the breakdown of positions in the sample, it becomes clear that e-science librarianship is not a defined field.


This breakdown into categories raises further questions; Will the number of data oriented librarian positions grow or will a hybrid data and subject librarian position become more common? Will the responsibilities currently handled by subject librarians increase to the extent that they become their own position? Will e-science be a standard means of science information work and its features become subsumed by existing positions so that no specific e-science position ever becomes defined? How will and to what extent will e-science methods grow and become universal? These of course are conjectures about the future, with no immediate answers. The answers depend on e-science itself and they will affect libraries and librarian training.


Even if e-science is being applied, whether or not librarians have a role in it and the type of role they have is still unclear.


Currently, it is impossible to know the degree to which e-science will be used and thus how much of a need there will be for anyone, including librarians, to engage e-science.


Organizations thinking of hiring an e-science librarian need to assess what e-science duties current staff can fill, what amount of data is being produced, and if it will need to be or will be required to be shared. Additionally, the state of e-science must continue to be monitored and studied so that those considering it can proceed in a smart and efficient way. E-Science may provide potential for librarians to branch out beyond the bounds of traditional library practices, while still dealing with the information management that characterizes library science. Yet, because e-science is not yet common practice, the library field must proceed into this new territory with caution.

(Emphasis mine.)

I'm not sure if I have anything profound to add to the above other than that none of it really surprises me.

My only worry is that the library field will enter this area with too much caution and just be totally too late to the party, irrelevant and unwanted. Of course, there are equal risks in entering the field, knocking on a bunch of doors, launching a bunch of initiatives and still being viewed as irrelevant and unwanted. It's the kind of thing where a bunch of places need to try a bunch of different things to see what sticks and hopefully why, allowing others to learn and perhaps replicate successes.

But I don't know about you, I'd rather die storming the mountain than sitting quietly in the base camp.

No responses yet

Some resources for reference assistant training in a scitech library

Trust me, I really tried to come up with a cool, funny title for this post.


We have a new reference assistant starting here next week. As is somewhat typical for such a position, the new staff member has a science subject background rather than a library background. In this case, Maps/GIS.

So I thought it might be a good idea to gather together some resources for helping our new hire get acclimatized to reference work in an academic science & engineering library. After all, we're not born with the ability to do good reference interviews!

With the help of the fine folk in Friendfeed, I've gathered together some very good general sources. As well, I've trawled through the archives from Issues in Science & Technology Librarianship and Science & Technology Libraries to find some other good articles.

Of course, please feel free to suggest other resources that might be of help in the comments. Anything related to reference or just general life in scitech libraries would be appreciated.



S&TL (Warning: all toll access articles.)


Yeah, this is a ton of reading. The point isn't that someone should memorize every word; most of the articles probably only need to be scanned. What I'm hoping for is a list of resources that will help someone get acclimatized to reference service and hopefully become aware of many of the main issues around such services in academic libraries. As well, there's a bit in here about some general issues in academic libraries.

Over time I can imagine adding to and pruning the list. As well, I can also imagine highlighting a few key resources with the rest in a more supporting role.

As I said above, suggestions are more than welcome.

Update 2011.07.19. Thanks to DJF on Friendfeed for pointing out the original citation for the Oranges & Peaches item. It's Dewdney, Patricia, and Gillian Michell. 1996. "Oranges and peaches: Understanding communication accidents in the reference interview" RQ 35 (4): 520.

4 responses so far

Going to JCDL2011: ACM/IEEE Joint Conference on Digital Libraries

I'll be at the 2011 ACM/IEEE Joint Conference on Digital Libraries at the University of Ottawa for the next few days. I plan on doing a bit of tweeting while I'm there but probably no live blogging.

I hope to have a summary post up here sometime after the conference with my impressions. Taking advantage of the relative proximity of Ottawa, this will be my first time at JCDL and I'm really looking forward to it. It's probably a bit more technical than I've been getting into recently but stretching the mind is always a good thing.

I hope to see some of you there. Definitely if you're there and you see me, come say Hi!

No responses yet
