Text and data mining tools available for research
Access copyrighted materials for data analysis through the UD Library, Museums and Press
The traditional method of studying literature is through careful, critical and complete reading. However, there are other ways to learn from this material.
For example, researchers can trace words, patterns and trends throughout books to foster more data-driven exploration. They can research questions like, “How does Shakespeare’s language vary by genre?” or they can explore the Bard’s word choice and frequency to discover how his style compares to that of other authors. While traditionally associated with business and the sciences, this type of data analysis extends well into humanities research.
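As a rough illustration of the kind of word-frequency comparison described above, the following Python sketch counts words in two short passages and compares how often a given word appears in each. It is only a minimal sketch: the sample sentences are invented placeholders rather than quotations from any real text, the function names are hypothetical, and it does not represent how any particular library tool works.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count lowercase word tokens (letters and apostrophes) in a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


def relative_frequency(counts: Counter, word: str) -> float:
    """Return the share of all tokens accounted for by `word`."""
    total = sum(counts.values())
    return counts[word] / total if total else 0.0


# Invented placeholder passages (assumption: stand-ins for two genres).
comedy_sample = "love is merry and love is light"
tragedy_sample = "death is near and blood is spilled"

comedy_counts = word_frequencies(comedy_sample)
tragedy_counts = word_frequencies(tragedy_sample)

# Compare how prominent each word is in the two samples.
for word in ("love", "death"):
    print(
        word,
        round(relative_frequency(comedy_counts, word), 3),
        round(relative_frequency(tragedy_counts, word), 3),
    )
```

In practice a researcher would replace the placeholder passages with full texts and would likely normalize spelling and filter out very common words before comparing authors or genres.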
One way the UD community can access tools for such computational analysis and text mining is through HathiTrust, a digital library that preserves and provides access to more than 16.7 million digital copies of books and journal articles from research libraries across the world.
Available through the UD Library, Museums and Press website, the HathiTrust database offers easy-to-use computational tools suited for beginners, as well as more complex tools for advanced data analysis. The HathiTrust Research Center recently expanded the range of materials researchers can analyze with these tools to cover the entire HathiTrust collection, including works protected by copyright.
With access to the entire, growing collection of titles in HathiTrust, UD researchers can strengthen their work by applying data mining and computational analysis techniques to a much larger body of material.
To learn more about which tools could work for your research, including other available resources beyond HathiTrust, schedule a consultation with a librarian.
For further information, please visit: https://library.udel.edu/ris/