Integrate HathiTrust Records Into Your Catalog

Leo Migdal

You can build collections from HathiTrust and search them separately from the full HathiTrust database, which lets you run focused searches on subsets of HathiTrust materials. A collection can be designated "public" (available to anyone who goes to HathiTrust) or "private" (available only to you), and it can be permanent or temporary. To create a permanent collection, you must log in to HathiTrust with your Bear ID and password. Items can be added to collections either from the "PageTurner" interface, where pages of the book are viewed, or from the search results page in "Full-text Search". In both cases, items can be added to a new or existing permanent collection, or to a temporary collection (a temporary collection is created automatically if the user is not logged in). To create a permanent collection in HathiTrust, do the following:

If you later want to edit the collection, change its private/public setting, or delete it, your collections are always listed in the "My Collections" tab whenever you are logged in to HathiTrust. With this method, you can add single items to your collection by going to the PageTurner for specific titles.

Finding New York City directories in HathiTrust is a little like herding cattle: you find the first bunch quickly, and then spend the rest of your time rounding up strays. Instead of a long list of individual directories, searching HathiTrust yields a series of record sets into which the individual directories have been organized.

A record set often holds multiple directories, but sometimes contains only one, and a single directory series may appear across multiple record sets. It's at this point that the search process begins to feel like a long, lonely cattle drive. So before I could proceed, I first had to create my own mini-catalog: a little slice of HathiTrust, stored in a local data frame and dedicated to this project. My local catalog helps me keep track of what I've found so I don't waste time rounding up the same directories over and over again. I also use my local catalog to store bibliographic data associated with each directory: things like title, originating repository, and publication year. And as I convert the text pages stored on HathiTrust into a multi-year data frame of address entries, my local catalog helps me keep on top of where I am in the scraping, cleaning,...
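The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's actual script: the original catalog lives in a data frame, while this sketch uses a plain dictionary so that it runs with the standard library alone. The class names, field names, status labels, and identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DirectoryEntry:
    """One directory volume tracked in the local catalog (hypothetical fields)."""
    record_set_id: str   # id of the record set the volume was found in
    volume_id: str       # id of the individual directory volume
    title: str
    repository: str      # originating repository
    year: int            # publication year
    status: str = "found"  # assumed workflow labels: found, scraped, cleaned


class LocalCatalog:
    """A small local catalog keyed by volume id, so no volume is stored twice."""

    def __init__(self) -> None:
        self._entries: Dict[str, DirectoryEntry] = {}

    def add(self, entry: DirectoryEntry) -> bool:
        """Store the entry; return False if the volume is already catalogued."""
        if entry.volume_id in self._entries:
            return False
        self._entries[entry.volume_id] = entry
        return True

    def set_status(self, volume_id: str, status: str) -> None:
        """Record how far a volume has progressed through processing."""
        self._entries[volume_id].status = status

    def with_status(self, status: str) -> List[str]:
        """Return the ids of volumes at a given stage, ordered by year."""
        matches = [e for e in self._entries.values() if e.status == status]
        return [e.volume_id for e in sorted(matches, key=lambda e: e.year)]
```

Keying the catalog on the volume id rather than the record set id matters here, because the same directory series can surface in more than one record set.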

Here's an example of what a record set looks like in HathiTrust. And here's a snippet of my current local catalog (I've omitted some columns for readability): I've already gathered the id numbers for nineteen record sets at HathiTrust, but there's every reason to think I'll find more in the future, so I wanted a script that could mine HathiTrust, store...

Zephir, the HathiTrust bibliographic metadata management system, is managed by CDL’s Discovery & Delivery team. In this advice column, Barbara Cormack, the metadata analyst for Zephir, answers common questions about contributing records to Zephir. While these questions were written by fictitious authors, you are welcome to submit your own questions to Zephir (email: zephir-help@ucop.edu).

I keep hearing that records in HathiTrust are grouped together, or clustered, but I also keep seeing what look like multiple records for the same work or title in the catalog. Can you explain exactly how records get grouped together in the HathiTrust catalog?

Thanks for your inquiry. It’s a great question and something that comes up fairly regularly. I hope you’re prepared to get “down in the weeds” a little in order for me to explain how this works in Zephir. Records submitted to Zephir go through two phases of processing. In phase one, the records are “prepared” for ingest.

During this phase, some of the metadata in the record is validated and manipulated, the record is assigned to a cluster, and it receives a cluster identifier (or “CID”). Zephir does the cluster assignment using one of three possible data elements. First, if the incoming record has an OCLC number, Zephir will determine whether that OCLC number is already in the database. If it is, Zephir will assign the incoming record to the existing cluster that uses that number. No further matches are attempted.
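The OCLC check can be pictured as a simple lookup. The sketch below is an assumption-laden illustration, not Zephir's implementation: the function name, the dictionary standing in for the database, and the sample identifiers are all hypothetical. Because only the first of the three data elements is described here, the function returns None when there is no OCLC match instead of guessing at the remaining steps.

```python
from typing import Dict, Optional


def assign_cluster_by_oclc(
    oclc_number: Optional[str],
    oclc_to_cid: Dict[str, str],
) -> Optional[str]:
    """Return the CID of the existing cluster that uses this OCLC number.

    oclc_to_cid is a hypothetical in-memory stand-in for the database index.
    None means no OCLC match was made, so other matching (not covered in
    this sketch) would have to decide the cluster.
    """
    if oclc_number is None:
        # The incoming record carries no OCLC number, so this check is skipped.
        return None
    # A hit ends matching: the record joins the cluster already using the number.
    return oclc_to_cid.get(oclc_number)


# Hypothetical index contents, for illustration only.
index = {"oclc-0001": "cid-100", "oclc-0002": "cid-200"}
matched_cid = assign_cluster_by_oclc("oclc-0002", index)
```

With the sample index above, `matched_cid` is the CID stored for `oclc-0002`, and a record whose number is absent from the index comes back as None.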
