Using the New Workflow for Improved MARC Records from the OhioLINK ETD Center

Dec 1, 2021

Join us to learn about the new process for downloading improved MARC records for OhioLINK ETDs.  The new process uses the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) via the free cataloging tool MarcEdit. During the session Emily Flynn will demonstrate the new workflow and showcase the improvements to the MARC records.

Transcription

Emily Flynn: Good morning, and welcome to this morning's webinar, Using the New Workflow for Improved MARC Records from the OhioLINK ETD Center. I don't have a slide up for myself, but I am Emily Flynn, metadata and ETD coordinator at OhioLINK. So if you do anything with ETDs or have ever put in an ETD ticket or, on the flip side, dealt with cataloging, you've probably spoken or worked with me. So this morning, I'm going to go through the quick agenda here. I'm going to review the new workflow using OAI-PMH and MarcEdit: I'll give an overview, do a walk-through, and talk about the documentation we have. Then I'll review the improved MARC records themselves and open it up for some Q&A. This is a recorded presentation, so we'll also be posting it with the documentation at a later date. So for the overview, the new workflow uses the Open Archives Initiative Protocol for Metadata Harvesting, also known as OAI-PMH. This is a process that allows access to certain ETD information in our OhioLINK ETD Center. What that means is that the ETD data can be used in various ways, and it's really easy for our developers to update. So we've decided to add in a MARCXML template that can be used with this OAI-PMH, and it's very easy to generate the MARC records. In order to get the records, though, you need a tool, and MarcEdit is a great one. It's a free cataloging tool by Terry Reese. If you've not used it before, it has a lot of capabilities, but it is best known for its batch editing. It's very powerful. You can edit a whole set of records very specifically, by a field or subfield with very particular information in it. You can add a field to the entire set. You can also assign tasks, which act like text strings in OCLC Connexion, if you're used to using that. So it's a really great tool. I have the download link here, but it's really easy to Google as well. And Terry also has a help page that contains links to how to sign up for the MarcEdit listserv.
There's a knowledge base. There are also YouTube tutorials, and I think it was earlier this year that he revised and added to those. He also has a direct email for the tool, and he's usually very prompt in answering if you have a direct question for him. So if you don't have it, you'll need to install MarcEdit before you can use this process. Just a few things of note before we get started looking at the new workflow and the records themselves. Each institutional submission site is harvested separately using the prefix of the accession number. So if your institution has multiple submission sites, which you can find out using the OhioLINK ETD public site, whatever is at the beginning of the accession number for an ETD is what you will use. The prefix is usually alpha characters, so it'll say OSU and a bunch of numbers, and it's that OSU part that you would use to harvest OSU records, for example. The new workflow itself provides MARC records for any updated ETD metadata, also known in cataloging land as overlay records, together with the new records. An example of this would be a local ETD administrator adding or extending an embargo date, which would update the information, or something like making a correction to the author's name or title, which would also update the ETD. It would then get provided as an updated record alongside the new records. There is a way to sort these out, and that is already explained in the documentation we have. Also, dates matter. We had a really great update to the ETD Center on August 2nd, 2021. That also meant that all the ETDs technically got updated as well, so the update date for all of the ETDs is currently August 2nd. That means when you're harvesting, you need to choose August 3rd or a more recent date to get just the newer stuff. If you leave the date blank, it'll give you all the MARC records, and if you choose August 2nd, you will get everything as well.
But again, there's a way to sort those out based on date, if you need to. So I'm going to pause for a minute, but I'm going to take questions at the end, so make note of anything. Just in case I get to it during the walk-through, I'll wait before I look at the chat. So for this morning's walk-through, I'm just going to use some screenshots, but I will show all the steps along the way. When you open MarcEdit, you'll see a landing page, and this can be customized in the preferences. These are the things I most frequently use, which is why my screen looks like this, but you can change the icons based on what you want to do with it. You're going to start by going to Tools, then OAI Harvester Tools, and then Harvest OAI Records. The batch harvester option below it is a way to set up a recurring harvest, but today we're just going to look at the manual one-off harvest. When you click it, it'll open a pop-up window. The server address is in the documentation, so you can either copy and paste it in or carefully type it in. These usually save, which is why there's a drop-down in the field; in any subsequent harvests, you should just be able to pull it down and select it from the list. The prefix from the accession number is what's known as the set name. So again, I'm just going to put in OSU, which is going to give me some OSU MARC records. The metadata type is MARC21 XML, and that's directly tied to the crosswalk path, so you don't have to touch this. If you choose something else from the drop-down, like Dublin Core, it'll automatically update the crosswalk path, so you don't have to worry about it, but it is tied to the metadata type, so you might see a change if you were to choose something else. Again, based on the dates, I'm going to put in August 3rd, which will give me all of the catalog records for OSU ETDs that were newly published or updated on August 3rd or later. And then I'm going to click OK. At this point, you'll see a little progress bar.
It's going to move back and forth as it pulls records; you'll see the bar going up and down until it completes. Once it's done, you'll see the screens change a little bit. We get a results pop-up window that'll tell us some information, and you can see the edit screen behind it with the MARC records showing. In this particular harvest, we got seven hundred and ten records, and it processed three resumption tokens, which means it went through multiple batches. It does give a last resumption token. Based on what I've tested and what I've confirmed with Terry Reese, who created and maintains MarcEdit, MarcEdit is pretty smart about resumption tokens, and so it will run through that last one. It'll give it to you, but it's already grabbed everything, so you shouldn't need to use it. There is further information in the documentation if you want more specifics. You can always copy it down and go through the manual process in MarcEdit to follow resumption tokens. But from what it sounds like and what I've seen, you don't actually need to run it, because MarcEdit will run it one last time to get your remaining records. It's just to help when your batches are split, but it tries to keep them together. So I'm going to go ahead and close. At this point, you could make any changes you want in Tools or Edit; they have a lot of the capabilities that I mentioned earlier. But right now I'm just going to create a MARC file. What we're looking at is known as a .MRK file, because it's readable to us. But what we want is a .MRC file in order to load it into the catalog or share it with others. So it's different from Save and Save As; we're going to go down to Compile File into MARC. At this point, you'll get a save window; choose wherever you want to save it. In the OhioLINK offices, we prefer UTF-8 MARC file, but there are other options. So you select that, type in your file name, and click Save.
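For readers who want to see what the harvest dialog is doing underneath, here is a minimal Python sketch of an OAI-PMH ListRecords harvest that keeps following resumption tokens until the server stops sending one. The base URL `https://example.org/oai`, the metadata prefix `marc21`, and the lowercase set name `osu` are placeholders assumed for illustration; the real server address and values are in the OhioLINK documentation. The `fetch` argument is a hypothetical callable that returns the XML for a URL, which keeps the sketch runnable offline. This is not a description of how MarcEdit itself is implemented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 XML namespace


def list_records_url(base_url, metadata_prefix="marc21", set_name=None,
                     from_date=None, token=None):
    """Build a ListRecords request URL (all argument values are assumptions)."""
    if token:
        # A resumption token is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_name:
            params["set"] = set_name      # accession-number prefix, e.g. "osu"
        if from_date:
            params["from"] = from_date    # e.g. "2021-08-03" for newer records
    return f"{base_url}?{urlencode(params)}"


def harvest(base_url, fetch, **first_request):
    """Collect <record> elements across batches; returns (records, batches)."""
    records, batches, token = [], 0, None
    while True:
        url = list_records_url(base_url, token=token, **first_request)
        root = ET.fromstring(fetch(url))
        records.extend(root.iter(f"{OAI}record"))
        batches += 1
        token_el = root.find(f"{OAI}ListRecords/{OAI}resumptionToken")
        token = (token_el.text or "").strip() if token_el is not None else ""
        if not token:
            # A missing or empty token means the list is complete.
            return records, batches


if __name__ == "__main__":
    # Two canned responses stand in for a real server (hypothetical data).
    head = '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
    tail = "</ListRecords></OAI-PMH>"
    page1 = head + "<record/><record/><resumptionToken>t1</resumptionToken>" + tail
    page2 = head + "<record/><resumptionToken/>" + tail

    def canned_fetch(url):
        return page2 if "resumptionToken=t1" in url else page1

    recs, batches = harvest("https://example.org/oai", canned_fetch,
                            set_name="osu", from_date="2021-08-03")
    print(len(recs), "records in", batches, "batches")
```

Against a real server, `fetch` could be as simple as `lambda url: urllib.request.urlopen(url).read()`. Leaving out `from_date` corresponds to leaving the date blank in the dialog, which returns every record in the set.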
And then you'll get your record file, which is the .MRC version. As I mentioned, this particular file can be loaded into your library catalog or shared with whoever, if you need to send it to someone else. This is the version for the computer itself. I'm going to pause again to give you a chance to write down any questions. So at this point, I wanted to show you the documentation that we have available. Once this recorded webinar is ready, it'll also be posted with the documentation. Just to be clear about where this lives: we're going to start on the OhioLINK homepage, where you scroll down to the campus staff area and click Ostaff. From there, you'll see the ETD Center documentation. You click that, and it'll take you to this page that has lots of release notes, the roster for the local admins, and what manuals are available. And then down here in this other documentation area is where we have put the new cataloging process. To show this better, I have put the names of the documents here. They are called OhioLINK ETD Center OAI-PMH Cataloging Records: the Quick Start Guide, which is a concise write-up with limited screenshots, and then there's also the full manual itself. It's the same information, but with lots of screenshots, so every step of the way you'll see what you should be seeing and can follow along. This is really great for those who have not used MarcEdit before; or if there's a step you're getting tripped up on or not sure about, the manual will show you all the screenshots. Both the Quick Start Guide and the manual also have additional information on how to sort MARC records using MarcEdit and how to pull out and separate the overlays from the new records based on a couple of the date fields in the records, if that's what you would like to do.
There's also a little tips and tricks section, and we tried to cover as much information as we could to help people, so that you can use either the Quick Start Guide or the manual and be off and running with just that information. We tried to make sure it covered all the things it needed to, so that hopefully you won't have any questions. Of course, you can always email support if there's something you need help with, but hopefully that'll get you going. I also have a video, the ETD Center Cataloging Using MarcEdit and OAI-PMH video. I made this this fall, and it's a five-minute demo. It goes through a quick pass of the process itself and then a slightly slower second one, where I explain a little bit more. I run it in real time and pull records. So you can certainly come back to this webinar if that's helpful, but if you just want to take a look at the process again, you probably want to go check out this demo. So at this point, I'm going to show you the improved MARC records. For anyone who hasn't used MarcEdit, I wanted to include a couple of screenshots just to show you what it looks like, because it's a bit different from looking at catalog records in a different tool. They all seem to be slightly different in how they present. In MarcEdit, each field starts with an equal sign, which is crucial, and then the field number. Then there are two spaces, and either the field begins or you start with the indicators. The delimiter is also a dollar sign. I think those are the biggest things to note here. So this is the top portion of the record. The same record continues with the abstract in the 520. And then we get the rest of the remaining fields. So this is a very quick look at one full record in MarcEdit. This is not a comprehensive list, but just to give you an idea of the upgraded and revised MARC records: we have added fields and subfields, specifically the 655 and the 245 subfield b.
The 245 subfield b is now pulled out from the main field; it used to just be smushed together in the MARC records from our tool. And the 856 now says OhioLINK in delimiter three. We've improved punctuation throughout, but you'll particularly notice it in the 100 field. So whether or not certain subfields get a period or a comma based on what comes after them, that should now be accurate. The same with the 245. And again, in various other fields, we've made sure we follow those rules a little bit better; we were able to put that in the template. We've also updated fields and indicators, especially for the 040. The 300 now says approximately for the number of pages, since again the student puts that in, and it might depend on whether preliminary pages have been counted or not. There's a 588 note that got updated. And also, we rearranged the note order to follow cataloging standards more closely. The new workflow also meets the current national and international RDA standards. With the system upgrade, we also handle diacritics better, although you still might see some issues, and HTML tags may come through to the MARC records as well. There are more updates in addition to these, but this is just a taste of what you'll see. As a reminder, with this new and improved workflow, we will be retiring the older ETDCat tool in a couple of weeks, this month, December 2021. So if you want to pull any new records, you'll need to use this new process. And finally, I just want to give thanks to our OHTECH SI developers. Without them, we wouldn't have an ETD Center. This is a homegrown system that they built, and now they've revised it to ETD Center 3.0. Along with it, they were able to create the new template for the OAI-PMH feed to make sure we have the best possible MARC records we can give you. They're still preliminary, but hopefully they're more robust and you can make better use of them if you choose to.
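To make the record layout described in the walk-through concrete (equal sign, field number, two spaces, indicators, dollar-sign delimiters), here is a small Python sketch that splits one line of a .MRK file into its parts. The sample field contents in the usage lines are invented for illustration and are not taken from an actual OhioLINK record. The sketch assumes MarcEdit's convention of showing a blank indicator as a backslash, and it does not handle literal dollar signs inside the data.

```python
def parse_mrk_line(line):
    """Split one MarcEdit mnemonic (.MRK) line into tag, indicators, subfields."""
    if not line.startswith("=") or line[4:6] != "  ":
        raise ValueError("expected '=TAG' followed by two spaces")
    tag, body = line[1:4], line[6:]

    # The leader and the 00X control fields have no indicators or subfields.
    if tag == "LDR" or tag < "010":
        return {"tag": tag, "indicators": None, "subfields": [], "value": body}

    # Two indicator characters; a backslash stands for a blank indicator.
    indicators = body[:2].replace("\\", " ")

    # Each subfield is '$', a one-character code, then the data.
    subfields = [(part[0], part[1:]) for part in body[2:].split("$") if part]
    return {"tag": tag, "indicators": indicators,
            "subfields": subfields, "value": None}


if __name__ == "__main__":
    # Hypothetical sample lines, written only to show the layout.
    title = parse_mrk_line("=245  10$aA sample title :$ba sample subtitle /$cby A. Student.")
    print(title["tag"], title["indicators"], title["subfields"])

    link = parse_mrk_line("=856  40$3OhioLINK$uhttps://example.org/etd")
    print(dict(link["subfields"])["3"])
```

In the first sample line the subtitle sits in its own `$b` subfield, which is the kind of separation mentioned above for the 245.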
And also, thanks to the member catalogers who helped out: Joan Milligan from University of Dayton, Sevim McCutcheon from Kent State University, and Peter Lisius, I'm sorry if that's not a correct pronunciation, also from Kent State University. Joan, Sevim and Peter were a great help. They not only helped figure out all the specifics of, you know, making sure the punctuation was correct and what fields needed to be added or adjusted; they also helped me with the new documentation to make sure that the MarcEdit instructions were clear and that we captured a majority of the contingencies or questions that people might have. So many thanks to them for working with me over the course of many months to get this all ironed out. So at this time, I'm going to open it up to questions. I saw a few things in chat, so I'm going to open that up. Just a reminder, this is a recording. But if you would like to unmute and ask a question, feel free to do that. Otherwise, just put your question in the chat. So I'm going to go ahead and open up the chat here to see what we have. So, Marty Jenkins says one MarcEdit thing that has tripped me up is that delimiter a has to be explicitly labeled; it's not assumed, as in Sierra. Correct. So for a field that needs to start with delimiter a or, in this case, dollar sign a, you'll see that at the beginning of those particular fields. The MARC records that we generate should have them all. You might have noticed this in the fixed fields: they don't start with any delimiter a's, and that should just take as well. That should be OK. So what's provided should be all right, but again, it matters if you add in a field. For our typical vendor batches, Erin and I always add in a 910 with the batch name in delimiter a. And so when we add the new field to the records, we would specifically tell MarcEdit two blank indicators, so slash slash, immediately followed by dollar sign a.
And then we would give the file name as we normally would and then hit insert field, and it would put it in all the records. But you're right, Marty. If you do add any fields, you'd need to say dollar sign a to start off the field. And Michelle from Xavier asks, is anyone using this process to create records in OCLC? So feel free to jump in and let Michelle know in chat if you are. This is a pretty new process, so I don't know how many people have tried it out yet. However, we did include brief instructions on how to take these records and load them into OCLC. We specifically covered Connexion Client, since that's what we mostly use. You can load a file in, and it will pop it into a file in OCLC for you. From there you can save it and generate OCLC numbers if needed. And Libby from BGSU says, yes, I've used OCLC for this process. Great. Thanks, Libby. Oh, and Greg Jones has as well. So you can check out the documentation, but it looks like there are a couple of people who have been able to use it with OCLC. Masha asks, are the records for embargoed ETDs included in the batches, or are they included as they become available? That is a fantastic question. There is a record for each ETD that is published, so any of them, including embargoed ones, will have a record. We were able to add embargo notes into those records, so now you'll see a field that says this ETD is delayed until such and such a date. So we've put the notes into the records, and if you wanted to separate them out, you could, based on searching for that particular note. And Masha says, excellent, thank you. Yeah, that was a great question that came up when I was working with the member catalogers: could we include that? And yes, we can. So we were able to pull that data in. Are there any further questions? Hopefully, you have found this webinar helpful in going through this information and showing you the new workflow process.
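Marty's point about spelling out the indicators and dollar sign a can be illustrated outside MarcEdit as well. The following Python sketch adds a 910 field to every record in a block of .MRK text, writing the two blank indicators as backslashes (the characters MarcEdit displays for blanks) followed by an explicit `$a`. The function name `add_field_to_all`, the batch name `ETD_2021_12`, and the sample records are hypothetical. The sketch assumes records are separated by a blank line and simply appends the field to the end of each record rather than inserting it in tag order, so it is an approximation of the idea and not a substitute for MarcEdit's own add-field function.

```python
def add_field_to_all(mrk_text, tag, value, indicators="\\\\"):
    """Append '=TAG  <indicators>$a<value>' to every record in .MRK text.

    The default indicators are two backslashes, meaning two blank indicators.
    """
    new_line = f"={tag}  {indicators}$a{value}"

    # Records in the mnemonic text are assumed to be separated by blank lines.
    records = [block.rstrip() for block in mrk_text.strip().split("\n\n")
               if block.strip()]

    return "\n\n".join(f"{record}\n{new_line}" for record in records) + "\n"


if __name__ == "__main__":
    # Two tiny made-up records, just enough to show the result.
    sample = (
        "=LDR  00000nam a2200000 i 4500\n"
        "=245  10$aFirst sample title.\n"
        "\n"
        "=LDR  00000nam a2200000 i 4500\n"
        "=245  10$aSecond sample title.\n"
    )
    print(add_field_to_all(sample, "910", "ETD_2021_12"))
```

Leaving out the `$a` would produce a field whose data has no subfield code, which is the situation Marty describes tripping over.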
If you have any questions after the meeting, since I don't see any more coming in, feel free to email me or to email in, and that way Erin or I can field the questions, or someone else in the office can, depending on if it's about MarcEdit or anything else. All right, since I don't see any other questions, thank you for attending today, and I hope you have a good rest of your week. Thank you, everyone.

