6,000 Puritan Works Available for Free

Logan · Feb 3, 2023

@davejonescue has been hinting at pieces of this for a while. The three of us (Dave, Logan, and Alex) have been working on it somewhat in secret, and now it's finally time to share the fruits of our labor.

Many people are aware of the Early English Books Online website, which transcribed, character by character, early English books, among which Dave was able to find about 860 Puritans with around 6,000 works.

One downside (for many purposes) is the non-standard spelling of the era:

What if we could automate the correction of these? Alex had already started doing this on his own projects and had compiled a list of about 6,000 words and their corrections. With this basis, I wrote a script to identify an additional 17,000 of the most commonly occurring non-standard spellings, and then Alex and I painstakingly assigned them corrections.
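For anyone curious what "identify the most commonly occurring non-standard spellings" can look like in practice, here is a minimal sketch. It is not the actual script used for the project: the function name, the tiny word list, and the sample sentences are all hypothetical, and it simply counts every word that is missing from a set of known spellings.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[A-Za-z]+")

def rank_unknown_words(texts, known_words, top_n=10):
    """Return the top_n most frequent words that are not in known_words."""
    counts = Counter()
    for text in texts:
        for word in WORD_RE.findall(text):
            lowered = word.lower()
            if lowered not in known_words:
                counts[lowered] += 1
    return counts.most_common(top_n)

# Tiny illustrative inputs (hypothetical, not taken from the real corpus).
known_words = {"the", "of", "is", "and", "a", "soul", "grace", "sin", "work"}
texts = [
    "The soule is the worke of grace and the soule is a soule",
    "Sinne and grace, sinne and the worke of grace",
]

for word, count in rank_unknown_words(texts, known_words, top_n=3):
    print(word, count)
```

The ranked list is what a person would then go through by hand, assigning a correction to each entry.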

Then I ran all 6,000 works by the 860 authors through another script I'd written, and the result, compared to the above, looks like this:

So all of these Puritan works are now significantly "cleaner" than they were. We packaged these up into a customized application and would like to now reveal "Puritan Search", a free application for searching through and utilizing all of these documents. Not all are equally useful but there are some real gems that are now available to the general public and it's available, right now, for free on Windows, Mac, and Linux:
www.puritansearch.org

...But wait, there's more! In addition to the free search application described above, and in the interest of making these available as a cleaner base text to readers or publishers, I have also converted each of these to PDF, EPUB, and Word. I hope that, despite their imperfections, these will save current or aspiring publishers a lot of effort, while the layman has immediate access to something serviceable for reading.

These are available here:
https://sites.google.com/view/project-puritas/home

This was a team effort with months and months of effort to make happen and we pray it will be a blessing to the church worldwide for years to come. By all means share and spread the word.

Phil D. · Feb 3, 2023

Thank you, brothers!

davejonescue · Feb 3, 2023

There really is so much you can do with the Puritan Search software. Several things differentiate it from previous software and currently available indexes.

First, compared to the Puritan Hard Drive, what sets Puritan Search apart is, for one, its cost: it is absolutely free. Pastors are under no obligation to pay for a copy for each congregant if they choose to distribute it among their congregations. Missionaries do not have to pay for multiple copies to give to the pastors and people they are training in underdeveloped nations. Professors and students do not need to pay to use the software in their research, and neither do laymen and laywomen. Also, since this software is powered by corrected, hand-transcribed texts instead of OCRed facsimiles, the search results are much greater, and many times more accurate, than what the Puritan Hard Drive can provide. Nor is this a combined work of several media types from several Reformed segments of church history; it deals only with Puritan and Non-Conformist texts from 1550-1700. Since facsimiles are almost impossible to OCR (which is what initiated the TCP to begin with), this software can't help but give better results.

Secondly, compared to the only other database I know of regarding the Puritans, PRTS's Perkins Library Puritan Studies Index, Puritan Search is powered by primary source material instead of secondary source material. And in Puritan Search, no document is off limits, while many of the texts in PRTS's Index require institutional approval. Basically, PRTS's Index is others writing about the Puritans, while Puritan Search is the Puritans' writings themselves. This isn't to throw shade at PRTS's Index, for both are needed, but to clarify that this is different in that it uses only primary source material.

www.puritansearch.com

Ryan&Amber2013 · Feb 3, 2023

This is absolutely amazing and incredible! Thank you so much for your labors! It is so neat to have such works preserved.

biblesword · Feb 3, 2023

That’s great! If you wanted to update to modern English, it would be interesting to see how ChatGPT would handle the task: https://community.openai.com/t/is-there-an-api-for-chatgpt3/23871/5

Logan · Feb 3, 2023

biblesword said:
That’s great! If you wanted to update to modern English, it would be interesting to see how ChatGPT would handle the task: https://community.openai.com/t/is-there-an-api-for-chatgpt3/23871/5

Sure, or auto-translate to other languages, or convert to audiobooks. While undoubtedly imperfect, there are many exciting possibilities!

davejonescue · Feb 3, 2023

biblesword said:
That’s great! If you wanted to update to modern English, it would be interesting to see how ChatGPT would handle the task: https://community.openai.com/t/is-there-an-api-for-chatgpt3/23871/5

The problem with the chatbot auto-correcting to modern language is that it has a limit on how much it will convert. It would be a very tedious task to go 10 pages at a time. We are talking about over 500,000 pages of text. In my opinion, it would be much easier to have someone like Logan, who is a tech whiz, create a script based on a contemporary dictionary, run the works through it, produce a list of words not included, then create a correction list like the one used in correcting the EEBO-TCP docs initially. Then re-run a script, replacing those words with their contemporary counterparts over the entire corpus. It's like one of Logan's favorite sayings: "don't do manually what you can automate."

biblesword · Feb 3, 2023

davejonescue said:
The problem with the chatbot auto-correcting to modern language is that it has a limit on how much it will convert. It would be a very tedious task to go 10 pages at a time. We are talking about over 500,000 pages of text. In my opinion, it would be much easier to have someone like Logan, who is a tech whiz, create a script based on a contemporary dictionary, run the works through it, produce a list of words not included, then create a correction list like the one used in correcting the EEBO-TCP docs initially. Then re-run a script, replacing those words with their contemporary counterparts over the entire corpus. It's like one of Logan's favorite sayings: "don't do manually what you can automate."

You’re right: the chatbot would be very tedious, hence I mentioned the API, which could be automated. The API for the chatbot’s model (I think chatgpt-3?) is just now being released. I think previous versions are available.
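Any automated approach would still have to feed the text through in pieces that fit under whatever size limit applies. At 10 pages per request, 500,000 pages comes to roughly 50,000 requests. A minimal sketch of the splitting step only, assuming a hypothetical character budget per request and making no calls to any real API:

```python
def chunk_paragraphs(text, max_chars=2000):
    """Group paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole in its own chunk.
    """
    chunks, current = [], ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # Adding this paragraph would exceed the budget: close the chunk.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks

# Hypothetical sample text; max_chars is an assumed limit, not a documented one.
sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
for number, chunk in enumerate(chunk_paragraphs(sample, max_chars=40), start=1):
    print(number, len(chunk))
```

Splitting on paragraph boundaries keeps sentences intact, which matters if each piece is to be rewritten independently.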

Logan · Feb 3, 2023

davejonescue said:
The problem with the chatbot auto-correcting to modern language is that it has a limit on how much it will convert. It would be a very tedious task to go 10 pages at a time. We are talking about over 500,000 pages of text. In my opinion, it would be much easier to have someone like Logan, who is a tech whiz, create a script based on a contemporary dictionary, run the works through it, produce a list of words not included, then create a correction list like the one used in correcting the EEBO-TCP docs initially. Then re-run a script, replacing those words with their contemporary counterparts over the entire corpus. It's like one of Logan's favorite sayings: "don't do manually what you can automate."

So I actually do sort of have a good chunk of that already. As I identified words, I also created a "supplemental dictionary" which was a list of correctly spelled, but archaic words. "Sufficeth", "Betrayeth", etc. So you'd just need to go through and add a suitable replacement to each of these that are already identified and then run the script again.
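As a rough illustration of that last step, here is a sketch of running a replacement list over a text. It is not the project's script: the function name is hypothetical, and the modern equivalents chosen for "Sufficeth" and "Betrayeth" are assumptions for the example. It matches whole words only and carries over capitalization.

```python
import re

def apply_replacements(text, replacements):
    """Replace whole words listed in replacements, keeping capitalization."""
    if not replacements:
        return text
    lookup = {old.lower(): new for old, new in replacements.items()}
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in lookup) + r")\b",
        re.IGNORECASE,
    )

    def swap(match):
        found = match.group(0)
        new = lookup[found.lower()]
        if found.isupper() and len(found) > 1:
            return new.upper()                 # SUFFICETH -> SUFFICES
        if found[0].isupper():
            return new[0].upper() + new[1:]    # Sufficeth -> Suffices
        return new

    return pattern.sub(swap, text)

# Assumed replacements for two of the archaic words mentioned above.
replacements = {"sufficeth": "suffices", "betrayeth": "betrays"}
modernized = apply_replacements("It sufficeth. Betrayeth he his friend?", replacements)
print(modernized)  # It suffices. Betrays he his friend?
```

A plain word-for-word swap like this will not fix word order or grammar, so the output would still need a human eye.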

JOwen · Feb 3, 2023

This project is simply amazing. I thank the LORD for this work. What a gift you have given to the church.

Taylor · Feb 3, 2023

This is so fantastic. We owe you three a great debt.

Knight · Feb 3, 2023

I attempted to download the works and received the following error message - is it because the file size is too large?

davejonescue · Feb 3, 2023

Knight said:
I attempted to download the works and received the following error message - is it because the file size is too large?

Please try again. Just tested and all links were working on my end.

Knight · Feb 3, 2023

davejonescue said:
Please try again. Just tested and all links were working on my end.

Same result. I'll try with a different computer later.

davejonescue · Feb 3, 2023

This is a slightly more detailed video of the program's capabilities, as we wanted to keep the general introduction short and sweet.

Polanus1561 · Feb 3, 2023

Is the Puritan Hard Drive out of business with this?

Logan · Feb 3, 2023

Polanus1561 said:
Is the Puritan Hard Drive out of business with this?

I wouldn't think so. The original PDFs are still very valuable as source and while there is some overlap between works, I definitely know of some on the PHD that are not part of this.

But if you've ever tried to use PHD, it's extremely clunky and not really anything like this. So the perpetual "only 24 hours left on sale" and website from 1995 will still likely be around for a while.

davejonescue · Feb 3, 2023

Polanus1561 said:
Is the Puritan Hard Drive out of business with this?

I do not know. From what I could find, their 10,000+ resources comprise only about 2,500 texts, with the rest being audio and video. For some, a neatly indexed collection of Reformed sermons, conferences, lectures, and videos may be worth the $200 they ask after their weekly sales. This resource is somewhat different, in that it is only text, about 6,000 works in total, and it deals only with Puritans and Reformed Non-Conformists from the period 1550-1700. Also, no facsimiles are included, which makes all 500,000+ pages or so of text entirely searchable.

What we really aimed to do with this software is offer an index that could be distributed globally, at no charge. Even the Encyclopedia Purtanica, a software package with about 250 fully searchable works, is $99 a copy. For a small church that wants to bless its congregation with copies, for missionaries, for pastors in undeveloped nations, for students, etc., that can add up quickly. Our goal is to offer Puritan literature to all, in a way that none would be hindered by cost. For a body of work that is almost entirely in the public domain, it has become quite the market. While we in the West hardly notice, much of it being quite low in cost relative to our median wage, for others in the world a single Puritan Paperback can seem like a treasure. We wanted not only to create something that could benefit extended research into Puritan thought, by making it possible to glean from their collective mindset on any given topic all at once, but also to provide a "complete" library for those in other countries who would otherwise miss out on these works because of (1) reprints being too expensive for them, (2) the lack of institutional access to the original facsimiles, and (3) the extreme expense, even for a generous Westerner, of shipping books to somewhere like Africa or India so that someone there could build a considerable library, let alone on a mass scale.

With this tool, not only can the searchable index fit on a standard 16 GB flash drive, but so can the full body of works in EPUB, PDF, HTML, and DOCX formats. Both can also travel digitally and forgo shipping costs by being downloadable over the web. In this way we are extending the work EEBO-TCP started, and again offering it to the world free of charge, with no other hope but the glorification of God and the edification of his church.

This is not in any way to tarnish or degrade those who put in the long hours and hard work to bring us cleanly published Puritan works; we are greatly indebted to them. Rather, it is to advance the study of Puritanism in a way that has not yet been available, and to make Puritan works available to those who, because of economic strain, would not be able to access reprints and contemporary published editions either way.

Ryan&Amber2013 · Feb 3, 2023

Logan said:
I wouldn't think so. The original PDFs are still very valuable as source and while there is some overlap between works, I definitely know of some on the PHD that are not part of this.

But if you've ever tried to use PHD, it's extremely clunky and not really anything like this. So the perpetual "only 24 hours left on sale" and website from 1995 will still likely be around for a while.

Haha I know the sales gimmick you are talking about.

Logan · Feb 3, 2023

Ryan&Amber2013 said:
Haha I know the sales gimmick you are talking about.

That week-long sale is 20 years old and still going strong!

Taylor · Feb 3, 2023

Logan said:
That week-long sale is 20 years old and still going strong!

As a child, I used to wonder: "How do these companies know when their TV commercials are playing in order to make sure I didn't call more than five minutes after it ended?"

Regi Addictissimus · Feb 3, 2023

Great work, guys. I shared this with my team earlier.

Logan · Feb 3, 2023

Regi Addictissimus said:
Great work, guys. I shared this with my team earlier.

I was planning on writing something up for Jay but glad word is getting around. I particularly envisioned you guys being able to make good use of the .docx files as a base text for future work.

Jonathco · Feb 3, 2023

Wow guys, this is fantastic. Nice work!

retroGRAD3 · Feb 3, 2023

Thanks y'all

alexanderjames · Feb 3, 2023

Thank you all. It really is quite astounding what has been achieved.

I am wondering if for the EPUB format say, the individual files could perhaps be grouped into folders by author or alphabetically? Does anyone know how to do this?

Logan · Feb 3, 2023

alexanderjames said:
Thank you all. It really is quite astounding what has been achieved.

I am wondering if for the EPUB format say, the individual files could perhaps be grouped into folders by author or alphabetically? Does anyone know how to do this?

I'm not sure what you mean. Something different from what is already there? They are named first by the author, then the title. Do you mean that you want all the Abbot_Robert files to be in an Abbot_Robert folder?

Logan · Feb 3, 2023

I'll note that the epubs may need a little light tweaking before they are ready to read. I think the converter generates them all as one "page" which could make large files not load well. They might need to be internally split (e.g., by chapters).

Ryan&Amber2013 · Feb 3, 2023

I shared it with my pastors. We are at a PCA with probably 500 or so members, so the potential influence is great. It would be awesome to see them use it in their studies and bless the congregation with it.

alexanderjames · Feb 4, 2023

Logan said:
I'm not sure what you mean. Something different from what is already there? They are named first by the author, then the title. Do you mean that you want all the Abbot_Robert files to be in an Abbot_Robert folder?

Exactly this, yes, and/or perhaps under alphabetical folders "A", etc., (if possible please). But I wouldn't want to take much of your/anyone's time - I had a look online and saw it can be done using a script but I am not savvy enough to work it out myself.
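A script along these lines could do the grouping. This is only a sketch under one big assumption: that every file name starts with Surname_Forename_ followed by the title, as in the Abbot_Robert example, so the first two underscore-separated parts are taken as the author. The function names are hypothetical, and it is worth trying on a copy of the folder first, since authors with longer or single-word names would be grouped wrongly.

```python
import shutil
from pathlib import Path

def author_folder_name(filename):
    """Take the first two underscore-separated parts as Surname_Forename (assumed)."""
    parts = Path(filename).stem.split("_")
    return "_".join(parts[:2]) if len(parts) >= 2 else parts[0]

def group_by_author(source_dir, by_letter=False, pattern="*.epub"):
    """Move each matching file into a folder named after its author.

    With by_letter=True the author folders are nested under "A", "B", etc.
    Returns the list of new file paths.
    """
    source = Path(source_dir)
    moved = []
    # sorted() builds the full list first, so moving files is safe while looping.
    for path in sorted(source.glob(pattern)):
        author = author_folder_name(path.name)
        if by_letter:
            target_dir = source / author[0].upper() / author
        else:
            target_dir = source / author
        target_dir.mkdir(parents=True, exist_ok=True)
        destination = target_dir / path.name
        shutil.move(str(path), str(destination))
        moved.append(destination)
    return moved

# Example call (the folder name is hypothetical):
# group_by_author("Puritan_EPUBs", by_letter=True)
```

Calling it with `by_letter=False` gives one folder per author directly inside the source folder; `by_letter=True` adds the alphabetical level on top.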

Logan said:
I'll note that the epubs may need a little light tweaking before they are ready to read. I think the converter generates them all as one "page" which could make large files not load well. They might need to be internally split (e.g., by chapters).

I read parts of a couple last night and thought they worked really well! The vast majority of the individual files are small, I might test out the larger ones later today. I'd previously loaded Dave's 'PuritanInn' collection onto my Kindle and these seem to be better formatted.
