Ryan E. Johnson is a Metadata and Data Curation Specialist with UC San Diego’s Geisel Library. He works on metadata for digital collections in the UC San Diego Digital Collections, also known as the DAMS (Digital Asset Management System). Besides working with Special Collections and the Digital Library Programs, he is a member of the Research Data Curation Program, where he provides expertise in metadata and solicits information from UCSD researchers about their data.
Originally from Los Angeles, Ryan received his Master’s from Syracuse University in Library and Information Science, with a specialization in Digital Libraries. While currently preoccupied with the world of metadata, he is interested in data cleaning and transformation, addressing the challenges of publishing Linked Data, and collaborating with library information technology professionals to build software that provides users with excellent information.
Metadata Librarian and Data Curation Specialist, UC San Diego
Rank: Assistant Librarian
2014 - present
As a member of the Digital Object Metadata unit in the Geisel Library, I clean, map, and transform metadata from data providers to align with the data model of our Digital Collections, also known as the DAMS (Digital Asset Management System). I am the sole Metadata Analyst on dozens of digital collections, ranging from Special Collections papers to protein sequencing analysis research. I helped refine and plan new data models for the DAMS that better take advantage of Linked Data best practices and of conventions that communities such as Samvera and Fedora have agreed upon.
I also helped streamline digital collection publication, which went from a project-specific customization averaging many months (sometimes a year) to a standard process taking mere weeks. This streamlining required planning and collaborating with IT to develop ingest tools that transform MARC, AT, and Excel source metadata into what we refer to as ‘input streams’.
For context, the UC San Diego DAMS is composed of a custom backend, Blacklight, and Solr; we are transitioning to a new stack based on Fedora 4 and Hyrax. The metadata I most commonly work with is serialized in RDF/XML, but I am comfortable with many serializations of RDF. I work with many sources of data, including MARC, JSON (via data from APIs), XML, and tabular data. I use Python tools such as JupyterLab and pandas to clean and transform data at scale and to enable reproducibility.
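A minimal sketch of the kind of pandas-based cleaning described above; the sample records and field names are hypothetical, not drawn from an actual collection.

```python
import pandas as pd

# Hypothetical tabular source metadata with common quality issues:
# stray whitespace, a missing required field, and duplicate rows.
records = pd.DataFrame({
    "title": ["  Kelp Forest Survey ", "Kelp Forest Survey", None],
    "date": ["1987", "1987", "1990"],
})

cleaned = (
    records
    .dropna(subset=["title"])                          # drop rows missing a required field
    .assign(title=lambda df: df["title"].str.strip())  # normalize whitespace
    .drop_duplicates()                                 # collapse exact duplicates
    .reset_index(drop=True)
)
print(cleaned)
```

Running a pipeline like this in a notebook keeps each cleaning step visible and reproducible.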
Metadata Assistant, Hamilton College
As part of the Digital Humanities Initiative (DHi), I provided metadata expertise for faculty Digital Humanities projects, including work with MODS, RDF/XML, and encoding text in TEI. Hamilton was an early adopter of Islandora and the Fedora/Islandora/Drupal stack. I helped develop metadata entry forms, based on an Islandora module, that let researchers self-deposit their metadata; this work required XPath/XQuery knowledge and collaboration with the Islandora community.
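The XPath work mentioned above can be sketched with Python's standard library, which supports a limited XPath subset; the MODS fragment below is a trimmed, hypothetical example.

```python
import xml.etree.ElementTree as ET

# A hypothetical, trimmed MODS record for illustration.
mods = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Field Notes, 1972</title></titleInfo>
  <name><namePart>Doe, Jane</namePart></name>
</mods>
"""

ns = {"m": "http://www.loc.gov/mods/v3"}
root = ET.fromstring(mods)

# XPath-style queries pull out the fields a self-deposit form would populate.
title = root.find("m:titleInfo/m:title", ns).text
creator = root.find("m:name/m:namePart", ns).text
print(title, creator)
```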
Metadata Assistant, Cornell Institute for Social and Economic Research (CISER)
As an intern, I was tasked with providing metadata for CISER’s burgeoning online data portal, harnessing the physical codebooks on site and reconciling them against online sources such as ICPSR. During my short internship, I substantially increased the amount of metadata present in the repository.
Quality Assurance Tester, PC Games Division, THQ Inc.
As a quality assurance tester for the now-defunct video games publisher THQ, I tested PC games and logged errors in a database, noting the local environment and the steps to reproduce each bug. During ‘crunch’ periods, I worked overtime, and even double time, to meet publishing deadlines. I was offered a permanent position but declined in order to return to my education.
M.S., Library and Information Science (2012)
Syracuse University
B.A., Interdisciplinary Humanities (2009)
University of California, Santa Barbara
Other Professional Activities
- Metadata, Second Edition, by Marcia Lei Zeng and Jian Qin
I helped copy edit and add content to each chapter in the book.
Technical Skills and Expertise
- Expertise in multiple metadata standards and frameworks, including MODS, MADS, METS, RDF (and its serializations), EAD, and schema.org, as well as application profiles
- Experience creating metadata standards, ontologies, application profiles, and data modeling
- Knowledge of linked data theories and methods, especially as they relate to library data
- Data cleaning and metadata enrichment expertise
- Data transformation and enhancement through OpenRefine, APIs (parsing JSON), SPARQL queries, and XSLT
- Basic scripting and regular expression knowledge (Python, shell)
- Creating static web pages for documentation via GitHub Pages and ReadTheDocs, harnessing Jekyll, Markdown, GitHub, etc. for rapid and iterative deployment
- File management and version control software (git and GitHub), especially as it relates to data management and collaboration
- Basic LAMP (also Nginx) server administration, with PHP and shell scripting knowledge relevant to security and automation
- Experience in the Fedora/Islandora/Drupal stack as well as the Fedora/Samvera/Blacklight stack; involved in the Samvera and Hyrax communities
- Archival-quality digitization, scanning, and description experience
- Novice database administration experience (MySQL, SQL Server)
- Comfortable with software testing in support of rapid deployment and agile/iterative design methods, as well as with project-management software (JIRA, Redmine)
- Experience with many Linux distributions (especially Ubuntu); comfortable with virtual machines and environments (VirtualBox, Vagrant, Docker, Anaconda)
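As a small illustration of the API/JSON parsing listed above: a hypothetical authority-lookup response flattened into rows for review. The payload shape, field names, and URI are invented for the example.

```python
import json

# A hypothetical authority-lookup API response; all values are illustrative.
payload = """
{"results": [
  {"label": "San Diego (Calif.)", "uri": "http://example.org/auth/n79059222"}
]}
"""

data = json.loads(payload)
# Map API results into flat (label, uri) rows suitable for spreadsheet-based review.
rows = [(r["label"], r["uri"]) for r in data["results"]]
print(rows)
```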