The Dig - Greater Boston's Alternative News Source

SAVING SCIENCE, ONE DATASET AT A TIME

Written by SAUL TANNENBAUM Posted February 27, 2017 Filed Under: FEATURES, News, NEWS+OPINIONS, Tech

 

Scores of volunteers gathered in an MIT dining room on a sunny, unseasonably warm February Saturday with a singular mission: They were going to save scientific data they feared the Trump administration, through ideology or neglect, would remove from public access.

 

DataRescue Boston @ MIT is part of a rapidly growing movement of volunteers (scientists, computer engineers, librarians and archivists) determined to copy and archive public scientific data before it might disappear. The data rescue movement is shepherded by the Environmental Data and Governance Initiative (EDGI) and DataRefuge, both of which are networks of scientists concerned about preserving scientific infrastructure. Work began at a “guerilla web archiving” event in Toronto on December 17 and has grown rapidly. On the same day as the MIT rescue event, groups in DC, Colorado, and Pennsylvania were also hard at work. While the original emergency focus was climate data, seen as being at special risk, the scope has grown to include larger sets of environmental data from the Environmental Protection Agency, the Department of the Interior, the Department of Energy, and the National Oceanic and Atmospheric Administration. Despite a history of less than two months, Data Rescues are highly organized events:

 

  • There are “surveyors,” people tasked with reviewing target web sites, mapping out the organization, and developing primers analyzing what was found.

  • “Seeders” take the surveyors’ primers and systematically go through each web site and, using a Chrome web browser extension, either nominate a dataset for archiving or mark it for special attention.

  • “Harvesters,” in turn, review each dataset that requires attention and determine a method to capture it. Datasets that can be harvested by routine methods end up at the Internet Archive’s End of Term archive, while datasets that require special handling are destined for the DataRefuge archive.

  • Lastly, there are “storytellers,” including this writer, assigned to document the event and its people, as well as develop stories around the data being saved.

 

Volunteers at the MIT Rescue ranged from first-year students to chairs of academic departments. There were climate and health scientists, concerned about data in their fields, and computer scientists who were there to help build tools. And there were the librarians and archivists, for whom preservation of information is a calling. While some attendees were self-interested, since their research careers depend upon the data they were rescuing, many came for a broader set of reasons.

 

“I don’t want to see measles killing 1000 children a year like it used to,” said one participant who was focusing on FDA data. Another called data deletion “the modern form of book burning.”

 

In the afternoon, a group of volunteers and leaders met to talk about long-term sustainability. It’s one thing to identify data sets and get them safely archived. But data sets that can’t be found after being rescued are worthless. Along with the data, its metadata (the context and description of the data) is necessary. And lastly, the data’s provenance must be maintained: There has to be a means to prove that the archived data is the same as the data that was originally on a government web site.
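One common way to make that kind of proof possible is to record a cryptographic checksum of each file at the moment it is harvested, then recompute it later against the archived copy. The sketch below is an illustration only, not a description of the tooling DataRefuge or the Internet Archive actually use; the function names, the record fields, and the idea of storing the source URL next to the hash are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_provenance(path: Path, source_url: str) -> dict:
    """Build a hypothetical provenance record at harvest time."""
    return {
        "file": path.name,
        "source_url": source_url,  # where the harvester says the file came from
        "sha256": sha256_of(path),
    }


def verify(path: Path, record: dict) -> bool:
    """Check that an archived copy still matches the digest recorded at harvest."""
    return sha256_of(path) == record["sha256"]
```

A matching digest shows only that the archived copy has not changed since it was harvested. Showing that the harvested file was what the agency originally published still depends on trusting, or independently documenting, the harvest itself.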

 

The librarians and archivists in the group were there to caution the technologists that this wasn’t a new problem, and to remind them that the solutions would end up being more complicated than they might imagine. However daunting, this was a discussion that began a transformation of a rescue mission into a barn-building, creating an infrastructure to provide safe harbor for endangered data. As University of Pennsylvania librarian Laurie Allen told the Data Rescue DC group, the vulnerabilities of federal data are not new to the Trump administration, just newly exposed.

 

By the end of the day, primers had been written for six government agencies and 16 sub-agencies. Close to 4,000 URLs were seeded from the Department of the Interior and Department of Energy. And 53 datasets were harvested, adding 35 gigabytes of data to the archive. The next scheduled Data Rescue Boston will be at Northeastern University on March 24th.

 

Participant interviews were conducted by Amanda Axel.


Filed Under: FEATURES, News, NEWS+OPINIONS, Tech Tagged With: data, Data Refuge, Data Rescue, Donald Trump, EDGI, Environmental Data and Governance Initiative, FDA, Laurie Allen, MIT, Saul Tannenbaum, science
